Abstract

We give new proofs for the hardness amplification of efficiently samplable predicates
and of weakly verifiable puzzles which generalize to new settings. More concretely, in
the first part of the paper, we give a new proof of Yao's XOR-Lemma that additionally
applies to related theorems in the cryptographic setting. Our proof seems simpler
than previous ones, yet immediately generalizes to statements similar in spirit such as
the extraction lemma used to obtain pseudo-random generators from one-way functions
[Hastad, Impagliazzo, Levin, Luby, SIAM J. on Comp. 1999].

In the second part of the paper, we give a new proof of hardness amplification for
weakly verifiable puzzles, which is more general than previous ones in that it gives the
right bound even for an arbitrary monotone function applied to the checking circuit of
the underlying puzzle.

Both our proofs are applicable in many settings of interactive cryptographic protocols
because they satisfy a property that we call "non-rewinding". In particular, we show
that any weak cryptographic protocol whose security is given by the unpredictability of
single bits can be strengthened with a natural information theoretic protocol. As an
example, we show how these theorems solve the main open question from [Halevi and
Rabin, TCC2008] concerning bit commitment.

1 Introduction

In this paper, we study two scenarios of hardness amplification. In the first scenario, one
is given a predicate $P(x)$, which is somewhat hard to compute given $x$. More concretely:
$\Pr[A(x) = P(x)] \le 1 - \frac{\delta}{2}$ for any $A$ in some given complexity class, where typically $\delta$ is not
too close to 1 but at least polynomially big (say, $\frac{1}{\mathrm{poly}(n)} < \delta < 1 - \frac{1}{\mathrm{poly}(n)}$). One then aims to
find a predicate which is even harder to compute.

In the second scenario, one is given a computational search problem, specified by some
relation $R(x,y)$. One then assumes that no algorithm $A$ of a certain complexity satisfies
$\Pr[(x, A(x)) \in R] > 1 - \delta$, and again, is interested in finding relations which are even harder
to satisfy. It is sometimes the case that $R$ may only be efficiently computable given some
side information generated while sampling $x$. Such problems are called "weakly verifiable
puzzles".

Our aim is to give proofs for theorems in both scenarios which are both simple and 
versatile. In particular, we will see that our proofs are applicable in the interactive setting, 
where they give stronger results than those previously known. 



1.1 Predicates 

Overview and previous work Roughly speaking, Yao's XOR-Lemma [Yao82] states that
if a predicate $P(x)$ is somewhat hard to compute, then the $k$-wise XOR $P^{\oplus k}(x_1, \ldots, x_k) :=
P(x_1) \oplus \cdots \oplus P(x_k)$ will be even harder to compute. While intuitive, such statements are
often somewhat difficult to prove. The first proof of the above appears to be by Levin [Lev87]
(see also [GNW95]). In some cases, even stronger statements are needed: for example, the
extraction lemma states that one can even extract several bits out of the concatenation
$P(x_1)P(x_2) \ldots P(x_k)$, which look pseudorandom to a distinguisher given $x_1, \ldots, x_k$. Proving
this statement for tight parameters is considered the technically most difficult step in the
original proof that one-way functions imply pseudorandom generators [HILL99]. Excluding
this work, the easiest proof available seems to be based on Impagliazzo's hard-core set theorem
[Imp95], more concretely the uniform version of it [Hol05, BHK09]. A proof along those lines
is given in [Hol06b, HHR06]. Similar considerations are true for the more efficient proof that
one-way functions imply pseudorandom generators given by Haitner et al. [HRV10].



Contributions of this paper In this paper, we are concerned with statements of a similar 
nature as (but which generalize beyond) Yao's XOR-Lemma. We give a new theorem, which 
is much easier to prove than the hard-core set theorem, and which is still sufficient for all the 
aforementioned applications. 

Our main observation can be described in relatively simple terms. In the known proof 
based on hard-core sets ([Imp95, Hol05]), the essential statement is that there is a large set
$S$, such that for $x \in S$ it is computationally difficult to predict $P(x)$ with a non-negligible
advantage over a random guess. Proving the existence of the set $S$ requires some work
(basically, boosting, as shown in [KS99]). We use the idea that the set $S$ can be made
dependent on the circuit which attempts to predict $P$. The existence of a hard set $S$ for a
particular circuit is a much easier fact to show (and occurs as a building block in some proofs
of the hard-core theorem). For our idea to go through, $S$ has to be made dependent on some
of the inputs to $C$ as well as some other fixed choices. This technique of switching quantifiers
resembles a statement in [BSW03], where Impagliazzo's hard-core set theorem is used to show
that in some definitions of pseudo-entropy it is also possible to switch quantifiers.

Besides being technically simpler, making the set S dependent on C has an additional 
advantage. For example, consider a proof of the XOR Lemma. To get a contradiction, a 
circuit C is assumed which does well in predicting the XOR, and a circuit D for a single 
instance is built from $C$. On input $x$, $D$ calls $C$ as a subroutine several times, each time
"hiding" $x$ as one of the elements of the input. Using our ideas, we can ensure that $x$ is
hidden always in the same place $i$, and even more, the values of the inputs $x_1, \ldots, x_{i-1}$ are
constant and independent of $x$. This property, which we call non-rewinding, is useful in the
case one wants to amplify the hardness of interactive protocols.

We remark that in this paper we are not concerned with efficiency of XOR-Lemmas in
the sense of derandomizing them (as in, e.g., [IW97, IJK06, IJKW08]).



1.2 Weakly Verifiable Puzzles 

Overview and Previous Work The notion of weakly verifiable puzzles was introduced
by Canetti et al. [CHS05]. A weakly verifiable puzzle consists of a sampling method, which
produces an instance $x$ together with a circuit $\Gamma(y)$, checking solutions. The task is, given $x$
but not necessarily $\Gamma$, to find a string $y$ for which $\Gamma(y) = 1$. One-way functions are an
example: $\Gamma(y)$ just outputs 1 if $f(y) = x$ (since $\Gamma$ depends on the instance it can contain $x$).
However, weakly verifiable puzzles are more general, since $\Gamma$ is not given at the time $y$ has
to be found.

Canetti et al. show that if no efficient algorithm finds solutions with probability higher
than $\delta$, then any efficient algorithm finds $k$ solutions simultaneously with probability at
most $\delta^k + \epsilon$, for some negligible $\epsilon$. This result was strengthened by [IJK09], showing that
requiring some $\delta' > \delta + 1/\mathrm{poly}(n)$ fraction of correct answers already makes efficient
algorithms fail, if $k$ is large enough. Independently of the current work, Jutla [Jut10] improved
their bound to make it match the standard Chernoff bound. A different strengthening was
given in [HR08], where it was noted that the algorithm in [CHS05] has an additional property
which implies that it can be applied in an interactive cryptographic setting. They also studied
how much easier solving a weakly verifiable puzzle becomes if one simply asks for a single
correct solution from $k$ given puzzles. Also independently of our work, Chung et al. [CLLY09]
give a proof for the threshold case (similar to Jutla) which is also applicable in an interactive
setting; however, their parameters are somewhat weaker than the ones given by most other
papers. Finally, [DIJK09] gives yet another strengthening: they allow a weakly verifiable
puzzle to have multiple solutions indexed by some element $q$, and the adversary is allowed to
interactively obtain some of them. They then study under what conditions the hardness is
amplified in this setting.

Contributions of this paper In this work, we present a theorem which unifies and
strengthens the results given in [CHS05, HR08, IJK09, Jut10, CLLY09]: assume a monotone
function $g : \{0,1\}^k \to \{0,1\}$ specifies which subpuzzles need to be solved in order to
solve the resulting puzzle (i.e., if $c_1, \ldots, c_k$ are bits where $c_i$ indicates that a valid solution
for puzzle $i$ was found, then $g(c_1, \ldots, c_k) = 1$ iff this is sufficient to give a valid solution for
the overall case). Our theorem gives a tight bound for any such $g$ (in this sense, previous
papers considered only threshold functions for $g$). Furthermore, as we will see, our proof is
also applicable in an interactive setting (the proofs given in [IJK09, Jut10] do not have this
property). Our proof is heavily inspired by the one given in [CHS05].
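
To make the role of the monotone function $g$ concrete, the following Python sketch lists two choices of $g$ mentioned in the text (all subpuzzles solved, and a threshold) together with a brute-force monotonicity check. The function names are illustrative assumptions and do not come from the paper.

```python
from itertools import product

def g_all(c):
    """Direct product: every subpuzzle must be solved."""
    return int(all(c))

def g_threshold(t):
    """Threshold rule: at least t of the subpuzzles must be solved."""
    return lambda c: int(sum(c) >= t)

def is_monotone(g, k):
    """Brute-force check over {0,1}^k: turning a 0 into a 1 never
    changes the value of g from 1 to 0."""
    for c in product((0, 1), repeat=k):
        for i in range(k):
            if c[i] == 0:
                d = c[:i] + (1,) + c[i + 1:]
                if g(c) > g(d):
                    return False
    return True
```

A rule such as "exactly one subpuzzle solved" fails this check, which is why it falls outside the class of functions the theorem is stated for.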

1.3 Strengthening Cryptographic Protocols 

Overview and Previous Work Consider a cryptographic protocol, such as bit commitment.
Suppose that a non-perfect implementation of such a protocol is given, which we would
like to improve. For example, assume that a cheating receiver can guess the bit committed
to with some probability, say 3/5. Furthermore, suppose that a cheating sender can open the
commitment in two ways with some probability, say 1/5. Can we use this protocol to get a
stronger bit commitment protocol?

Such questions have been studied in various forms both in the information theoretic and
the computational model [DKS99, DFMS04, DNR04, Hol05, HR05, Wul07, HR08].

However, all of the previous computational work except [HR08] focused on the case where
the parties participating in the protocol are at least semi-honest, i.e., they follow the protocol
correctly (this is a natural assumption in the case for the work on key agreement [DNR04,
Hol05, HR05], as in this case the participating parties can be assumed to be honest). An
exception to this trend was the work by Halevi and Rabin [HR08], where it was shown that
for some protocols, the information theoretic bounds also apply computationally.

The above are results in the case where the protocol is repeated sequentially. The case where
the protocol is repeated in parallel is more complicated [BIN97, PW07, PV07, HPWP10,
Hai09, CL10].

Contributions of this paper We explicitly define "non-rewinding" (which was, however,
pointed to in [HR08]), which helps to provide a sufficient condition for transforming
complexity theoretic results into results for cryptographic protocols. Using the above results, and
specifically the fact that their proofs are non-rewinding, we show that we can strengthen any
protocol in which the security goal is to make a bit held by one party unpredictable to the other
party, in the case where an information theoretic analogue can be strengthened. We also
study interactive weakly verifiable puzzles (as has been done implicitly in [HR08]), and show
that natural ways to amplify the hardness of these work.

We only remark that our proof is applicable to parallel repetition for non-interactive 
(two-round) protocols (e.g. CAPTCHAs). 

2 Preliminaries 

Definition 1. Consider a circuit $C$ which has a tuple of designated input wires labeled
$y_1, \ldots, y_k$. An oracle circuit $D(\cdot)$ with calls to $C$ is non-rewinding if there is a fixed $i$ and
fixed strings $y_1^*$ to $y_{i-1}^*$ such that for any input $y$ to $D$, all calls to $C$ use inputs $(y_1^*, \ldots, y_{i-1}^*, y)$
on the wires labeled $y_1, \ldots, y_i$.
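
As an illustration of Definition 1, the sketch below builds a wrapper $D$ around an oracle so that every query carries the same fixed prefix and places $D$'s own input in the next slot; only the later slots vary between calls. The helper `make_non_rewinding`, the `fill` callback and the logging oracle are hypothetical names chosen for this example, not constructions from the paper.

```python
def make_non_rewinding(C, prefix, k, fill, calls):
    """Build D(y) from oracle C so that every call places the fixed strings
    in `prefix` on the first wires and D's own input y on the next wire."""
    i = len(prefix)  # 0-based slot that receives D's input on every call

    def D(y):
        ones = 0
        for t in range(calls):
            # Later slots may change from call to call; earlier ones may not.
            suffix = tuple(fill(t, j) for j in range(i + 1, k))
            ones += C(tuple(prefix) + (y,) + suffix)
        return ones  # number of 1s returned by the oracle

    return D


# Toy oracle (hypothetical) that logs every query it receives.
queries = []

def logging_oracle(ys):
    queries.append(ys)
    return ys[-1] & 1

D = make_non_rewinding(logging_oracle, prefix=(7, 3), k=4,
                       fill=lambda t, j: t + j, calls=5)
result = D(9)
```

Inspecting `queries` afterwards shows that all five calls agree on the first three positions, which is the property the definition asks for.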

Definition 2. Let C be a circuit which has a block of input wires labeled x. An oracle 
circuit D which calls C (possibly several times) treats x obliviously if the input x to D is 
forwarded to C directly, and not used in any other way in D. 

We say that an event happens almost surely if it has probability $1 - 2^{-n}\,\mathrm{poly}(n)$.

We denote by $[m]$ the set $\{1, \ldots, m\}$. The density of a set $S \subseteq \{0,1\}^n$ is $\mu(S) = \frac{|S|}{2^n}$. We
sometimes identify a set $S$ with its characteristic function $S : \{0,1\}^n \to \{0,1\}$. We often
denote a tuple $(x_1, x_2, \ldots, x_k)$ by $x^{(k)}$.

If a distribution $\mu$ over some set is given, we write $x \leftarrow \mu$ to denote that $x$ is chosen
according to $\mu$. We sometimes identify sets with the uniform distribution over them. We
let $\mu_\delta$ be the Bernoulli distribution over $\{0,1\}$ with parameter $\delta$, i.e., $\Pr_{x \leftarrow \mu_\delta}[x = 1] = \delta$.
Furthermore, $\mu_\delta^k$ is the distribution over $\{0,1\}^k$ where each bit is i.i.d. according to $\mu_\delta$.

When two interactive algorithms $A$ and $B$ are given, we will denote by $\langle A, B\rangle_A$ the output
$A$ has in an interaction with $B$, and by $\langle A, B\rangle_B$ the output which $B$ has. We sometimes
consider probabilities like $\Pr[\langle A, B\rangle_A = \langle A, B\rangle_B]$, in which case the probability is over random
coins of $A$ and $B$ (if any), but they are chosen the same on the left and the right hand side.

3 Efficiently Samplable Predicates 

3.1 Single Instance 

3.1.1 Informal Discussion 

Fix a predicate $P : \{0,1\}^n \to \{0,1\}$ and a circuit $C(x,b,r)$ which takes an arbitrary $x \in
\{0,1\}^n$, a bit $b \in \{0,1\}$, and some randomness $r$ as input. We may think of $C$ as a circuit
which tries to distinguish the case $b = P(x)$ from the case $b = 1 - P(x)$. Our idea is to
identify a set $S$ for which we can show the following:

1. If $x$ is picked randomly from $S$, then $\Pr[C(x, P(x), r) = 1] \approx \Pr[C(x, 1 - P(x), r) = 1]$.

2. $C$ can be used to predict $P(x)$ for a uniform random $x$ correctly with probability close
to $1 - \frac{1}{2}\mu(S)$.

On an informal level, one could say that S explains the hardness of computing P from C's 
point of view: for elements from S the circuit just behaves as a uniform random guess, 
on the others it computes (or, more accurately, helps to compute) P. Readers familiar 
with Impagliazzo's hardcore lemma will notice the similarity: Impagliazzo finds a set which 
explains the computational difficulty of a predicate for any circuit of a certain size. Thus, 
in this sense Impagliazzo's theorem is stronger. The advantage of ours is that the proof is 
technically simpler, and that it can be used in the interactive setting (see Section [33]) which 
seemingly comes from the fact that it helps to build non-rewinding proofs. 

3.1.2 The Theorem 

The following theorem formalizes the above discussion. It will find S by producing a circuit 
which recognizes it, and also produces a circuit Q which uses C in order to predict P. 

Theorem 3. Let $P : \{0,1\}^n \to \{0,1\}$ be a computable predicate. There is an algorithm
Gen which takes as input a randomized circuit $C(x,b,r)$ and a parameter $\epsilon$, and outputs two
deterministic circuits $Q$ and $S$, both of size $\mathrm{size}(C) \cdot \mathrm{poly}(n, \frac{1}{\epsilon})$, as well as $\delta \in [0,1]$, such that
almost surely the following holds:

Large Set: $S(x, P(x))$ recognizes a set $S^* = \{x \mid S(x, P(x)) = 1\}$ of density at least $\mu(S^*) \ge \delta$.

Indistinguishability: For the above set $S^*$ we have

$$\Bigl|\Pr_{x \leftarrow \{0,1\}^n,\, r}[C(x, P(x), r) = 1] - \Pr_{x \leftarrow \{0,1\}^n,\, r}[C(x, P'(x), r) = 1]\Bigr| \le \epsilon, \tag{1}$$

where $P'(x) := P(x) \oplus S(x)$, i.e., $P'$ is the predicate which equals $P$ outside $S$ and
differs from $P$ within $S$.

Predictability: $Q$ predicts $P$ well: $\Pr_{x \leftarrow \{0,1\}^n}[Q(x) = P(x)] \ge 1 - \frac{\delta}{2}$.

Figure 1: Intuition for the proof of Theorem 3. In both pictures, on the vertical axis, the
advantage of the circuit in guessing right over a random guess is depicted. The elements are
then sorted according to this quantity. The point $x^*$ is chosen such that the area of A is
slightly smaller than the area of B (as in equation (5)).

Additionally, these algorithms have the following properties:

1. Unless $\delta = 1$, algorithm $Q$ predicts slightly better:$^1$ $\Pr[Q(x) = P(x)] \ge 1 - \frac{\delta}{2} + \frac{\epsilon}{4}$.

2. If $P$ is efficiently samplable (i.e., pairs $(x, P(x))$ can be generated in polynomial time),
Gen runs in time $\mathrm{poly}(n, \frac{1}{\epsilon})$.

3. Gen, $S$, and $Q$ can be implemented with oracle access to $C$ only (i.e., they do not use
the description of $C$).

4. When thought of as oracle circuits, $S$ and $Q$ use the oracle $C$ at most $O(\frac{n}{\epsilon^2})$ times. Also,
they both treat $x$ obliviously, and their output only depends on the number of 1's obtained
from the oracle calls to $C$ and, in case of $S$, the input $P(x)$.

Before we give the proof, we would like to mention that the proof uses no new techniques.
For example, it is very similar to Lemma 2.4 in [Hol05], which in turn is implicit in [Lev87,
GNW95] (see also Lemma 6.6 and Claim 7 on page 121 in [Hol06b]). Our main contribution
here is to give the statement and to note that it is very powerful.

Proof Overview. We assume that overall $C(x, P(x), r)$ is more often 1 than $C(x, 1 -
P(x), r)$. Make $S$ the largest set for which the Indistinguishability property is satisfied as
follows: order the elements of $\{0,1\}^n$ according to $\Delta_x := \Pr_r[C(x, P(x), r) = 1] - \Pr_r[C(x, 1 -
P(x), r) = 1]$, and insert them into $S$ sequentially until both $\Pr_{x \leftarrow S, r}[C(x, P(x), r) = 1] >
\Pr_{x \leftarrow S, r}[C(x, 1 - P(x), r) = 1]$ and indistinguishability is violated. Then, it only remains to
describe $Q$. For any $x \notin S$ note that $\Pr[C(x, P(x), r) = 1] - \Pr[C(x, 1 - P(x), r) = 1] \ge \epsilon$,
as otherwise $x$ could be added to $S$. Thus, for those elements $P(x)$ is the bit $b$ for which
$\Pr[C(x, b, r) = 1]$ is bigger. In this overview we assume that $\Pr[C(x, b, r) = 1]$ can be found
exactly, so we let $Q(x)$ compute the probabilities for $b = 0$ and $b = 1$, and answer accordingly;
we will call this rule the "Majority Rule". Clearly, $Q(x)$ is correct if $x \notin S$, and in order to
get "predictability", we only need to argue that $Q$ is not worse than a random guess on $S$.

$^1$ This implies that $\delta \ge \frac{\epsilon}{2}$, which can always be guaranteed.

Consider now Figure 1(a), where the elements are ordered according to $\Delta_x$. The areas
depicted A and B are roughly equal, which follows by the way we chose $S$ (note that
$\Pr_{x \leftarrow S, r}[C(x, P(x), r) = 1] - \Pr_{x \leftarrow S, r}[C(x, 1 - P(x), r) = 1] = \mathrm{E}_{x \leftarrow S}[\Delta_x]$).

At this point our problem is that the majority rule will give the incorrect answer for all
elements for which $\Delta_x < 0$, and as shown in Figure 1(b), this can be almost all of $S$, so that
in general the above $Q$ does perform worse than a random guess on $S$. The solution is to note
that it is sufficient to follow the majority rule in case the gap is bigger than $\Delta_{x^*}$. In the full
proof we will see that if the gap is small so that $-\Delta_{x^*} \le \Pr[C(x, 1, r) = 1] - \Pr[C(x, 0, r) =
1] \le \Delta_{x^*}$ then a randomized decision works: the probability of answering $b = 0$ is 1 if the
gap is $-\Delta_{x^*}$, the probability of answering $b = 0$ is 0 if the gap is $\Delta_{x^*}$. When the gap is in
between then the probability of answering $b = 0$ is linearly interpolated based on the value
of the gap. So for example, if the gap is 0, then $b = 0$ with probability $\frac{1}{2}$. A bit of thought
reveals that this is exactly because the areas A and B in Figure 1 are almost equal.

In the full proof, we also show how to sample all quantities accurately enough (which is
easy) and how to ensure that $S$ is a set of the right size (which seems to require a small trick
because $\Delta_x$ as defined above is not computable exactly, and so we actually use a different
quantity for $\Delta_x$). We think that the second is not really required for the applications later,
but it simplifies the statement of the above theorem and makes it somewhat more intuitive.

Proof. We describe algorithm Gen. First, obtain an estimate

$$\Delta \approx \Pr_{r,x}[C(x, P(x), r) = 1] - \Pr_{r,x}[C(x, 1 - P(x), r) = 1] \tag{2}$$

such that almost surely $\Delta$ is within $\epsilon/4$ of the actual quantity. If $|\Delta| < 3\epsilon/4$, we can
return $\delta = 1$, $S = \{0,1\}^n$, and a circuit $Q$ which guesses a uniform random bit. If $\Delta \le -3\epsilon/4$
replace $C$ with the circuit which outputs $1 - C$ in the following argument. Thus, from now
on assume $\Delta \ge 3\epsilon/4$ and that the actual quantity is at least $\epsilon/2$.

Sample random strings $r_1, \ldots, r_m$ for $C$, where $m = 100n/\epsilon^2$, and let $C'(x, b, i)$ be the
circuit which computes $C(x, b, r_i)$. Using a Chernoff bound, we see that for all $x \in \{0,1\}^n$

$$\Pr_r[C(x, P(x), r) = 1] - \Pr_r[C(x, 1 - P(x), r) = 1] = \Pr_{i \in [m]}[C'(x, P(x), i) = 1] - \Pr_{i \in [m]}[C'(x, 1 - P(x), i) = 1] \pm \epsilon/4 \tag{3}$$

almost surely.

Define, for any $x$,

$$\Delta_x := \Pr_{i \in [m]}[C'(x, P(x), i) = 1] - \Pr_{i \in [m]}[C'(x, 1 - P(x), i) = 1]. \tag{4}$$


Because we define $\Delta_x$ using $C'$ instead of $C$, we can compute $\Delta_x$ exactly for a given $x$. Now,
order the $x$ according to $\Delta_x$: let $x_1 \prec x_2$ if $\Delta_{x_1} < \Delta_{x_2}$, or both $\Delta_{x_1} = \Delta_{x_2}$ and $x_1 \le_L x_2$,
where $\le_L$ is the lexicographic ordering on bitstrings. We can compute $x_1 \prec x_2$ efficiently
given $(x_1, P(x_1))$ and $(x_2, P(x_2))$.

$^2$ It may be instructive to point out another rule which does not work: if one produces a uniform random
bit in case the gap is smaller than $\Delta_{x^*}$ then elements in the region marked A with negative gap larger than
$\Delta_{x^*}$ are problematic.
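
The step from $C$ to $C'$ and the resulting exactly computable $\Delta_x$ can be sketched in Python as follows. This is only a toy illustration under stated assumptions: inputs are integers standing for $n$-bit strings (so numeric order plays the role of the lexicographic order), and the names `fix_randomness`, `delta_x` and `precedes` are invented for the example.

```python
import random

def fix_randomness(C, m, r_bits, seed=0):
    """Sample r_1..r_m once and return C'(x, b, i) = C(x, b, r_i)."""
    rng = random.Random(seed)
    rs = [rng.getrandbits(r_bits) for _ in range(m)]

    def c_prime(x, b, i):
        return C(x, b, rs[i])

    return c_prime

def delta_x(c_prime, m, x, p_x):
    """Equation (4): acceptance rate with the right bit minus the rate
    with the wrong bit, taken over the m fixed random strings."""
    right = sum(c_prime(x, p_x, i) for i in range(m))
    wrong = sum(c_prime(x, 1 - p_x, i) for i in range(m))
    return (right - wrong) / m

def precedes(c_prime, m, x1, p1, x2, p2):
    """Order used in the proof: smaller Delta first, ties broken by x."""
    d1 = delta_x(c_prime, m, x1, p1)
    d2 = delta_x(c_prime, m, x2, p2)
    return d1 < d2 or (d1 == d2 and x1 <= x2)


# Two toy circuits (assumptions): one checks b against the low bit of x,
# the other ignores x and b and outputs a bit of its randomness.
exact = fix_randomness(lambda x, b, r: int(b == (x & 1)), m=50, r_bits=16)
coin = fix_randomness(lambda x, b, r: r & 1, m=50, r_bits=16)
```

Because the random strings are fixed once, `delta_x` is a deterministic function of $x$ and $P(x)$, which is what makes the ordering well defined.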

We claim that we can find $x^*$ such that almost surely (we assume $\epsilon > 10 \cdot 2^{-n}$, otherwise
we can get the theorem with exhaustive search)

$$\frac{\epsilon}{20} \le \frac{1}{2^n} \sum_{x \preceq x^*} \Delta_x \le \frac{\epsilon}{10}. \tag{5}$$

We pick $50n/\epsilon$ candidates, then almost surely one of them satisfies (5) with a safety margin
of $\epsilon/50$. For each of those candidates we estimate $\frac{1}{2^n} \sum_{x \preceq x^*} \Delta_x$ up to an error of $\epsilon/100$,
and keep one for which almost surely (5) is satisfied. We let $S(x, P(x))$ be the circuit which
recognizes the set $S^* := \{x \mid x \preceq x^*\}$, estimate $\delta' := |S^*|/2^n$ almost surely within an error
of $\epsilon/1000$, and output $\delta := \delta' - \epsilon/1000$. The situation at this moment is illustrated in Figure 1
and it is clear that the properties "large set" and "indistinguishability" are satisfied.
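
For intuition only, the choice of $x^*$ can be mimicked on a tiny domain where all $\Delta_x$ are known. The sketch below does an exhaustive scan instead of the sampling of candidates used in the proof, and it only enforces the upper bound of (5); the function name `build_hard_set` and the toy values are assumptions made for this illustration.

```python
def build_hard_set(deltas, eps):
    """deltas maps every x in a tiny domain to its Delta_x.
    Returns the longest prefix S of the sorted order whose normalized sum
    of Delta_x is at most eps/10, together with that normalized sum."""
    N = len(deltas)  # plays the role of 2^n
    order = sorted(deltas, key=lambda x: (deltas[x], x))
    S, total = [], 0.0
    for x in order:
        # Once the running sum exceeds the bound it can only grow,
        # since all remaining Delta values are at least as large.
        if (total + deltas[x]) / N > eps / 10:
            break
        total += deltas[x]
        S.append(x)
    return S, total / N


toy = {"00": -0.2, "01": 0.0, "10": 0.3, "11": 0.9}
S, avg = build_hard_set(toy, eps=1.0)
```

In the toy run the element with the largest advantage stays outside $S$, matching the picture in which the circuit is only useful on the complement of the set.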
We next describe $Q$. On input $x$, $Q$ calculates (exactly)

$$\Pr_{i \in [m]}[C'(x, 1, i) = 1] - \Pr_{i \in [m]}[C'(x, 0, i) = 1] = (2P(x) - 1)\Delta_x. \tag{6}$$

If $(2P(x) - 1)\Delta_x > \Delta_{x^*}$ (where $\Delta_{x^*}$ is defined by (4) for the element $x^*$ which defines $S$),
then output 1, if $(2P(x) - 1)\Delta_x < -\Delta_{x^*}$ output 0. If neither of the previous cases apply,
output 1 with probability $\frac{1}{2}\bigl(1 + \frac{(2P(x) - 1)\Delta_x}{\Delta_{x^*}}\bigr)$.
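
The decision rule of $Q$ just described can be written as a small function of the gap from equation (6). This is a sketch: `prob_answer_one` is a name chosen here, and the threshold argument `delta_star` stands for $\Delta_{x^*}$ and is assumed to be positive.

```python
def prob_answer_one(gap, delta_star):
    """Probability that Q outputs 1, where
    gap = Pr[C'(x,1,i)=1] - Pr[C'(x,0,i)=1] as in equation (6)
    and delta_star > 0 is the threshold Delta_{x*}."""
    if gap >= delta_star:
        return 1.0   # clear majority for b = 1
    if gap <= -delta_star:
        return 0.0   # clear majority for b = 0
    # Small gap: interpolate linearly between the two extremes.
    return 0.5 * (1 + gap / delta_star)
```

At the two boundary values the interpolation already gives 0 and 1, so treating the boundary as part of the deterministic cases does not change the rule.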

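To analyze the success probability of $Q$, we distinguish two cases. If $x \notin S$, we know
that $\Delta_x \ge \Delta_{x^*}$. Therefore, in this case, we get the correct answer with probability 1. If $x \in S$,
it is also easy to check that this will give the correct answer with probability $\max\{\frac{1}{2}(1 +
\frac{\Delta_x}{\Delta_{x^*}}), 0\}$, and thus, on average

$$\frac{1}{|S|} \sum_{x \in S} \max\Bigl\{\frac{1}{2}\Bigl(1 + \frac{\Delta_x}{\Delta_{x^*}}\Bigr), 0\Bigr\} \ge \frac{1}{|S|} \sum_{x \in S} \frac{1}{2}\Bigl(1 + \frac{\Delta_x}{\Delta_{x^*}}\Bigr) \ge \frac{1}{2} + \frac{\epsilon}{40\,\mu(S)},$$

using (5) and $0 < \Delta_{x^*} \le 1$. In total, we have probability at least $\mu(S)\bigl(\frac{1}{2} + \frac{\epsilon}{40\,\mu(S)}\bigr) + (1 - \mu(S)) = 1 - \frac{\mu(S)}{2} + \frac{\epsilon}{40}$ of answering
correctly. Since $\delta \le \mu(S) \le \delta + \epsilon/500$ by the choice of $\delta$, this quantity is at least $1 - \frac{\delta}{2}$, which implies "predictability".
It is possible to make $Q$ deterministic by trying all possible values for the randomness and
estimating the probability of it being correct.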

In order to get the additional property 1, we first run the above algorithm with input $\epsilon/3$
instead of $\epsilon$. If $\delta > 1 - 2\epsilon/3$, we instead output the set containing all elements and return 1
in place of $\delta$. Note that indistinguishability still holds because we only add a fraction of $2\epsilon/3$
elements to $S$. If $\delta \le 1 - 2\epsilon/3$, we enlarge $S$ by at least $\epsilon/2$ and at most $2\epsilon/3$; this can be
done by finding a new candidate for $x^*$ as above. We then output the new set and $\delta' := \delta + \frac{\epsilon}{2}$.

The additional properties 2, 3 and 4 follow by inspection of the proof. □

3.2 Multiple instances 
3.2.1 Informal Discussion 

We explain our idea on an example: suppose we want to prove Yao's XOR-Lemma. Thus,
we are given a predicate $P : \{0,1\}^n \to \{0,1\}$ which is somewhat hard to compute, i.e.,
$\Pr[C^{(1)}(x) = P(x)] \le 1 - \frac{\delta}{2}$ for any circuit $C^{(1)}$ coming from some family of circuits (the
superscript (1) should indicate that this is a circuit operating on a single instance). We
want to show that any circuit $C^{(\oplus k)}$ from a related family predicts $P(x_1) \oplus \cdots \oplus P(x_k)$ from
$(x_1, \ldots, x_k)$ correctly with probability very close to $\frac{1}{2}$, and aiming for a contradiction we now
assume that a circuit $C^{(\oplus k)}$ which does significantly better than this is given.

As a first step, we transform $C^{(\oplus k)}$ into a circuit $C^{(k)}(x_1, b_1, x_2, b_2, \ldots, x_k, b_k)$ as follows:
$C^{(k)}$ invokes $C^{(\oplus k)}(x_1, \ldots, x_k)$ and outputs 1 if the result equals $b_1 \oplus \cdots \oplus b_k$, otherwise it
outputs 0. We see that we would like to show $\Pr[C^{(k)}(x_1, P(x_1), \ldots, x_k, P(x_k)) = 1] \approx \frac{1}{2}$.

Here is the key idea: we apply Theorem 3 sequentially on every position $i$ of $C^{(k)}$. Done
properly, in each position one of the following happens: (a) we can use $C^{(k)}$ to predict $P(x)$
from $x$ with probability at least $1 - \frac{\delta}{2}$, or (b) we find a large set $S_i^*$ such that if $x_i \in S_i^*$,
$C^{(k)}$ behaves roughly the same in case $b_i$ equals $P(x_i)$ and in case $b_i$ is a uniform random bit.
If (a) happens at any point we get a contradiction and are done, so consider the case that
(b) happens $k$ times. Recall now how $C^{(k)}$ was built from $C^{(\oplus k)}$: it compares the output of
$C^{(\oplus k)}$ to $b_1 \oplus \cdots \oplus b_k$. If $x_i$ lands in the large set for any $i$ we can assume that $b_i$ is a random
bit (and it is very unlikely that this happens for no $i$). Then, $C^{(k)}$ outputs 1 exactly if $C^{(\oplus k)}$
correctly predicts a uniform random bit which is independent of the input to $C^{(\oplus k)}$. The
probability such a prediction is correct is exactly $\frac{1}{2}$, and overall we get that $C^{(\oplus k)}$ is correct
with probability close to $\frac{1}{2}$.

The theorem gives the formal statement for $C^{(k)}$; we later do the transformation to $C^{(\oplus k)}$
as an example.
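
The transformation from $C^{(\oplus k)}$ to $C^{(k)}$ described above is mechanical, and a short Python sketch may help. The predicate `P` (low bit of an integer) and the perfect predictor `perfect_xor` are toy assumptions used only to exercise the wrapper `to_checker`.

```python
from functools import reduce
from operator import xor

def to_checker(c_xor):
    """Turn a predictor for P(x_1) xor ... xor P(x_k) into C^(k), which
    takes pairs (x_i, b_i) and outputs 1 iff the prediction equals
    b_1 xor ... xor b_k."""
    def c_k(pairs):
        xs = [x for x, _ in pairs]
        claimed = reduce(xor, (b for _, b in pairs), 0)
        return int(c_xor(xs) == claimed)
    return c_k


def P(x):
    """Toy predicate (an assumption): the low bit of x."""
    return x & 1

def perfect_xor(xs):
    """Toy predictor that always gets the XOR right."""
    return reduce(xor, (P(x) for x in xs), 0)

c_k = to_checker(perfect_xor)
```

With a perfect predictor, `c_k` accepts exactly when the supplied bits XOR to the true value, so replacing any single $b_i$ by a uniform bit makes it accept with probability one half.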

3.2.2 The Theorem 

Fix a predicate $P : \{0,1\}^n \to \{0,1\}$ and a boolean circuit $C^{(k)}(x_1, b_1, \ldots, x_k, b_k)$. We are
interested in the probability that the circuit outputs 1 in the following Experiment 1:

Experiment 1:

  $\forall i \in \{1, \ldots, k\}$: $x_i \leftarrow \{0,1\}^n$
  $\forall i \in \{1, \ldots, k\}$: $b_i := P(x_i)$
  $r \leftarrow \{0,1\}^*$
  output $C^{(k)}(x_1, b_1, \ldots, x_k, b_k, r)$

We will claim that there are large sets $S_1^*, \ldots, S_k^*$ with the property that for any $x_i$ which
falls into $S_i^*$, we can set $b_i$ to a random bit and the probability of the experiment producing
a 1 will not change much. However, we will allow the sets $S_i^*$ to depend on the $x_j$ and $b_j$
for $j < i$; we therefore assume that an algorithm GenS is given which produces such a set on
input $t_i = (x_1, b_1, \ldots, x_{i-1}, b_{i-1})$.

Experiment 2:

  for $i := 1$ to $k$ do
    $t_i := (x_1, b_1, \ldots, x_{i-1}, b_{i-1})$
    $S_i^* := \mathrm{GenS}(t_i)$
    $x_i \leftarrow \{0,1\}^n$
    if $x_i \in S_i^*$ then $b_i \leftarrow \{0,1\}$ else $b_i := P(x_i)$ fi
  end for
  $r \leftarrow \{0,1\}^*$
  output $C^{(k)}(x_1, b_1, \ldots, x_k, b_k, r)$

Theorem 4 essentially states the following: assume no small circuit can predict $P(x)$ from
$x$ with probability $1 - \frac{\delta}{2}$. For any fixed circuit $C^{(k)}$, any $\epsilon$, and any $k$ there is an algorithm
GenS which produces sets $S_i^*$ with $\mu(S_i^*) \ge \delta$ and such that the probability that Experiment 1
outputs 1 differs by at most $\epsilon$ from the probability that Experiment 2 outputs 1.
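
The two experiments can be simulated directly on toy objects. In the sketch below the predicate, the circuit and the two extreme choices of GenS (empty sets and full sets) are assumptions picked to make the behaviour easy to see; the circuit is deterministic, so the randomness $r$ is omitted.

```python
import random
from functools import reduce
from operator import xor

def P(x):
    """Toy predicate (an assumption): the low bit of x."""
    return x & 1

def C(pairs):
    """Toy C^(k): outputs 1 iff b_1 xor ... xor b_k equals the XOR of P(x_i)."""
    return int(reduce(xor, (P(x) ^ b for x, b in pairs), 0) == 0)

def experiment1(k, n, rng):
    xs = [rng.getrandbits(n) for _ in range(k)]
    return C([(x, P(x)) for x in xs])

def experiment2(gen_s, k, n, rng):
    pairs = []
    for _ in range(k):
        in_set = gen_s(tuple(pairs))  # membership test for S_i^*, may depend on t_i
        x = rng.getrandbits(n)
        b = rng.getrandbits(1) if in_set(x) else P(x)
        pairs.append((x, b))
    return C(pairs)

def estimate(run, trials=4000, seed=0):
    """Empirical probability that an experiment outputs 1."""
    rng = random.Random(seed)
    return sum(run(rng) for _ in range(trials)) / trials

def nothing(t):
    """GenS returning the empty set for every prefix t."""
    return lambda x: False

def everything(t):
    """GenS returning all of {0,1}^n for every prefix t."""
    return lambda x: True
```

With empty sets Experiment 2 coincides with Experiment 1; with full sets every bit is random and this particular circuit outputs 1 about half of the time, so the two experiments are far apart. The theorem is about the regime in between, where GenS is chosen so that they stay close.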
Theorem 4. Let $P$ be a computable predicate, $k, \frac{1}{\epsilon} \in \mathrm{poly}(n)$ parameters. There are two
algorithms Gen and GenS as follows: Gen takes as input a randomized circuit $C^{(k)}$ and
a parameter $\epsilon$ and outputs a deterministic circuit $Q$ of size $\mathrm{size}(C^{(k)}) \cdot \mathrm{poly}(n)$ as well as
$\delta \in [0,1]$. GenS takes as input a circuit $C^{(k)}$, a tuple $t_i$, and a parameter $\epsilon$ and outputs a
deterministic circuit $S_{t_i}(x, b)$ of size $\mathrm{size}(C^{(k)}) \cdot \mathrm{poly}(n)$. After a run of Gen, almost surely the
following properties are satisfied:

Large Sets: For any value of $t_i := (x_1, b_1, \ldots, x_{i-1}, b_{i-1})$ the circuit $S_{t_i}(x_i, P(x_i))$ recognizes
a set $S_i^* := \{x_i \mid S(t_i, x_i, P(x_i)) = 1\}$. The probability that in an execution of
Experiment 2 we have $\mu(S_i^*) < \delta$ for any of the $S_i^*$ which occur is at most $\epsilon$.

Indistinguishability: Using sets $S_i^*$ as above in Experiment 2 gives

$$\bigl|\Pr[\text{Experiment 1 outputs 1}] - \Pr[\text{Experiment 2 outputs 1}]\bigr| \le \epsilon. \tag{7}$$

Predictability: $Q$ predicts $P$ well: $\Pr_{x \leftarrow \{0,1\}^n}[Q(x) = P(x)] \ge 1 - \frac{\delta}{2}$.

Additionally, these algorithms have the following properties:

1. Unless $\delta = 1$, algorithm $Q$ predicts slightly better: $\Pr[Q(x) = P(x)] \ge 1 - \frac{\delta}{2} + \frac{\epsilon}{16k}$.

2. If $P$ is efficiently samplable (i.e., pairs $(x, P(x))$ can be generated in polynomial time),
Gen and GenS run in time $\mathrm{poly}(n)$.

3. Gen, GenS, $S_{t_i}$, and $Q$ can be implemented with oracle access to $C$ only (i.e., they don't
use the description of $C$).

4. When thought of as oracle circuits, $S_{t_i}$ and $Q$ use the oracle $C$ at most $O(\frac{k^2 n}{\epsilon^2})$ times.
Also, they both treat $x$ obliviously and are non-rewinding. Finally, their output only
depends on the number of 1's obtained from the oracle calls to $C$ and, in case of $S_{t_i}$,
the input $P(x)$.

Proof. For any fixed tuple $t_i = (x_1, b_1, \ldots, x_{i-1}, b_{i-1})$, consider the circuit $C_{t_i}(x, b, r)$ which
uses $r$ to pick random $x_j$ for $j > i$, and runs $C^{(k)}(t_i, x, b, x_{i+1}, P(x_{i+1}), \ldots, x_k, P(x_k))$.$^3$ We
let GenS be the algorithm which invokes Gen with parameter $\frac{\epsilon}{4k}$ from Theorem 3 on the
circuit $C_{t_i}$ and then returns the circuit recognizing a set from there.

We next describe Gen: For $\ell = nk/\epsilon$ iterations, pick a random $i \in \{0, \ldots, k - 1\}$, use
the procedure in Experiment 2 until loop $i$, and run algorithm Gen from Theorem 3 with
parameter $\frac{\epsilon}{4k}$. This yields a parameter $\delta$ and a circuit $Q$. We output the pair $(Q, \delta)$ for the
smallest $\delta$ ever encountered. Since $k$ and $\frac{1}{\epsilon}$ are polynomial in $n$, almost surely every time
Theorem 3 is used the almost surely part happens. Thus, we get the property "predictability"
(and in fact the stronger property listed under additionally) immediately. We now argue
"large sets": consider the random variable $\delta$ when we pick a random $i$, simulate an execution
up to iteration $i$ of Experiment 2, then run Gen from Theorem 3. Let $\delta^*$ be the $\frac{\epsilon}{k}$-quantile
of this distribution, i.e., the smallest value such that with probability $\frac{\epsilon}{k}$ the value of $\delta$ is
at most $\delta^*$. The probability that a value not bigger than $\delta^*$ is output by Gen is at least
$1 - (1 - \frac{\epsilon}{k})^{nk/\epsilon} > 1 - 2^{-n}$, in which case "large sets" is satisfied.

$^3$ Formally, $C_{t_i}$ may not be a small circuit because at this point we do not assume $P$ to be efficiently
samplable, and $C_{t_i}$ seems to need to use $r$ to sample pairs $(x_j, P(x_j))$ for $j > i$. However, we can think of $C_{t_i}$
as oracle circuit with oracle access to $P$ at this moment. Inspection of the previous proof shows that later we
can remove the calls to $P$, as the $x_j$ with $j > i$ can be fixed.

We show "indistinguishability" with a standard hybrid argument. Consider the
Experiment $H_j$:

Random Experiment $H_j$:

  for $i := 1$ to $k$ do
    $t_i := (x_1, b_1, \ldots, x_{i-1}, b_{i-1})$
    $S_i^* := \mathrm{GenS}(t_i)$
    $x_i \leftarrow \{0,1\}^n$
    if $i \le j$ and $x_i \in S_i^*$ then
      $b_i \leftarrow \{0,1\}$
    else
      $b_i := P(x_i)$
    end if
  end for
  $r \leftarrow \{0,1\}^*$
  output $C^{(k)}(x_1, b_1, \ldots, x_k, b_k, r)$

Experiment $H_0$ is equivalent to Experiment 1, Experiment $H_k$ is the same as
Experiment 2. Applying Theorem 3 we get that for every fixed $x_1, \ldots, x_{j-1}, b_1, \ldots, b_{j-1}$, almost
surely

$$\Bigl|\Pr[C^{(k)}(x_1, b_1, \ldots, x_{j-1}, b_{j-1}, x_j, b_j^{(j-1)}, \ldots, x_k, P(x_k)) = 1] - \Pr[C^{(k)}(x_1, b_1, \ldots, x_{j-1}, b_{j-1}, x_j, b_j^{(j)}, \ldots, x_k, P(x_k)) = 1]\Bigr| \le \frac{\epsilon}{4k}, \tag{8}$$

where $b_j^{(j-1)}$ is chosen as $b_j^{(j-1)} = P(x_j)$ in experiment $H_{j-1}$, and $b_j^{(j)}$ is chosen the same way
as $b_j$ is chosen in experiment $H_j$ (in Theorem 3 the bit is flipped, but when using a uniform
bit instead of flipping it the distinguishing probability only gets smaller). Applying the
triangle inequality $k - 1$ times we get that almost surely the difference of the probabilities in
Experiment 1 and Experiment 2 is at most $\frac{\epsilon}{2}$. Since "almost surely" means with probability
$1 - 2^{-n}\,\mathrm{poly}(n) > 1 - \frac{\epsilon}{2}$, we get "indistinguishability".

We already showed the additional Property 1. Properties 2, 3, and 4 follow by inspection.

□

3.3 Example: Yao's XOR-Lemma 

As a first example, we prove Yao's XOR-Lemma from Theorem HI We will give the proof for 
the non-uniform model, but in fact it would also work in the uniform model of computation. 

Theorem 5 (Yao's XOR-Lemma). Let $P : \{0,1\}^n \to \{0,1\}$ be a predicate, such that for all
circuits $Q$ of size at most $s$:

$$\Pr[Q(x) = P(x)] \le 1 - \frac{\delta'}{2}. \tag{9}$$

Then, for all circuits $C^{(\oplus k)}$ of size $s/\operatorname{poly}(n, k, \frac{1}{\epsilon'})$:

$$\Pr[C^{(\oplus k)}(x_1, \ldots, x_k) = P(x_1) \oplus \cdots \oplus P(x_k)] \le \frac{1}{2} + (1 - \delta')^k + \epsilon'. \tag{10}$$

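Proof. Assume a circuit $C^{(\oplus k)}$ which contradicts (10) is given; we will obtain a circuit $Q$ which
contradicts (9). For this, let $C(x_1, b_1, \ldots, x_k, b_k)$ be the circuit which runs $C^{(\oplus k)}(x_1, \ldots, x_k)$
and outputs 1 if the result is the same as $b_1 \oplus \cdots \oplus b_k$. We apply Theorem 4, setting the
parameter $\epsilon$ to $\epsilon'/2$, which produces (among other things) a parameter $\delta$. We assume that
the 3 properties which almost surely hold do hold (otherwise run Gen again). In case $\delta \le \delta'$,
we use $Q$ to get a contradiction. Otherwise, we get

$$\Pr[C^{(\oplus k)}(x_1, \ldots, x_k) = P(x_1) \oplus \cdots \oplus P(x_k)] = \Pr[C(x_1, P(x_1), \ldots, x_k, P(x_k)) = 1] \tag{11}$$

$$\le \Pr[C \text{ outputs 1 in Experiment 2}] + \frac{\epsilon'}{2} \tag{12}$$

$$\le \Pr[C \text{ outputs 1 in Experiment 2 and all sets } S_i^* \text{ were of density at least } \delta] + \epsilon' \tag{13}$$

$$\le \frac{1}{2} + (1 - \delta)^k + \epsilon'. \tag{14}$$

Since $\delta > \delta'$ in this case, the last expression is at most $\frac{1}{2} + (1 - \delta')^k + \epsilon'$, which
contradicts the assumption on $C^{(\oplus k)}$.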

□ 
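The step from (13) to (14) has a simple information theoretic core, which the following sketch evaluates exactly. It assumes an idealized model that is not part of the proof itself: each of the $k$ bits is independently replaced by a uniform bit with probability `delta` (the input landed in its hard set), and the guesser knows every other bit perfectly. The function name is ours.

```python
from fractions import Fraction
from itertools import product

def ideal_xor_success(delta, k):
    """Best possible probability of guessing the XOR in the idealized model."""
    total = Fraction(0)
    for hidden in product([False, True], repeat=k):
        weight = Fraction(1)
        for is_hidden in hidden:
            weight *= delta if is_hidden else 1 - delta
        # One uniform, unknown bit makes the XOR uniform: success is exactly 1/2.
        total += weight * (Fraction(1, 2) if any(hidden) else 1)
    return total

delta, k = Fraction(1, 4), 5
success = ideal_xor_success(delta, k)
bound = Fraction(1, 2) + (1 - delta) ** k
```

In this model the success probability equals $\frac{1}{2} + \frac{1}{2}(1-\delta)^k$, which is below the bound $\frac{1}{2} + (1-\delta)^k$ used in (14).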



3.4 Example: Extraction Lemma ([HILL99])



Roughly speaking, the construction of a pseudorandom generator from an arbitrary one-way
function proceeds in two steps (see [Hol06a] for a more detailed description of this
view). First, using the Goldreich-Levin theorem [GL89], one constructs a pseudo-entropy
pair $(f, P)$, which is a pair of functions $f : \{0,1\}^n \to \{0,1\}^{\operatorname{poly}(n)}$, $P : \{0,1\}^n \to \{0,1\}$ such
that for all efficiently computable $A$,

$$\Pr[A(f(x)) = P(x)] \le 1 - \frac{\delta'}{2}, \tag{15}$$

for some non-negligible $\delta'$, and which satisfies some additional information theoretic property
(the information theoretic property ensures that predicting $P(x)$ from $f(x)$ is a computational
problem, and (15) does not already hold because $f$ is, say, a constant function).

Second, given independently sampled instances $x_1, \ldots, x_k$, the extraction lemma then
says that extracting slightly fewer than $\delta' k$ bits from the concatenation $P(x_1) \cdots P(x_k)$ will
give a string which is computationally indistinguishable from a random string. Due to the
information theoretic property above, once one has the extraction lemma, it is relatively easy
to get a pseudo-random generator. In the following we will prove this extraction lemma.

A technicality: the predicate which is hard to predict in this case is supposed to have
input $f(x)$ and output $P(x)$. However, in reality this does not have to be a predicate: $f$ is
not always injective (in fact, for $f$ obtained as above it will not be). Most works avoid this
problem by now stating that previous theorems also hold for randomized predicates. This is
often true, but some of the statements get very subtle if one does it this way, and statements
which involve sets of "hard" inputs very much so. We therefore choose to solve the problem
in a different way. We consider circuits which try to predict $P(x)$ from $x$, but are limited in
that they first are required to apply $f$ on $x$ and not use $x$ anywhere else. Now, we have a
predicate again, but it can only be difficult for this restricted class. However, since the oracle
circuit $Q$ in Theorem 4 treats $x$ obliviously we stay within this class.

Footnote: While [HILL99] constructs a pseudo-entropy pair (PEP) implicitly, the definition and name were introduced in [HHR06].

Lemma 6 (Extraction Lemma, implicit in [HILL99] ). Let (/, P) be a pair of functions sat- 
isfying ( Tf5|) for any polynomial time machine A and set k = 1/n 3 . Let Ext(m, s) be a strong 
extractor which extracts m = (5' — ^)k bits from any k-bit source with min-entropy (5' — ^)k 
such that the resulting bits have statistical distance at most 2~ n from uniform. Then, for any 
polynomial time A 

$$\Pr[A(f(x_1), \ldots, f(x_k), s, \mathrm{Ext}(P(x_1) \cdots P(x_k), s)) = 1] - \Pr[A(f(x_1), \ldots, f(x_k), s, U_m) = 1] \tag{16}$$

is negligible.

Proof. Assume otherwise, and let $\epsilon(n)$ be inverse polynomial and infinitely often smaller than
the distinguishing advantage of $A$. We consider the circuit $C^{(k)}(x_1, b_1, \ldots, x_k, b_k)$ which first
applies $f$ on every $x_i$, then picks $s$ at random, computes $z := \mathrm{Ext}(b_1 \cdots b_k, s)$, and executes
$A(f(x_1), \ldots, f(x_k), s, z)$. We apply Theorem 4 on $C^{(k)}$ using parameter $\frac{\epsilon}{2}$, which produces,
among other things, a parameter 5. Consider first the case 5 < 8' — Then, there is 
a circuit Q which predicts P(x) from x and uses x obliviously in C. This implies that 
the resulting circuit evaluates f(x) for any input x and ignores the input otherwise; we can 
therefore strip off this evaluation, and get a circuit which contradicts (fT5l) . In case 5 > — 
we run Experiment 2. If all sets which occur in the experiment are of size at least 5 (and 
this happens with probability at least 1 — e/2), then we can use a Chernoff-Bound to see that 
with probability 1 — 2~ n ( n \ at least (5 — j-)k > (5' — ^)k of the Xj land in their respective 
set $S_i^*$. Thus, in this case the extractor will produce a $z$ which is $2^{-\Omega(n)}$-close to uniform,
and the indistinguishability property of Theorem 4 implies that (16) is negligible. □

3.5 Cryptographic Protocols which output single bits 

Again we start with an example: consider a slightly weak bit commitment protocol, where
the receiver can guess the bit the sender committed to with probability $1 - \frac{\delta}{2}$. In such a
case, we might want to strengthen the scheme. For example, in order to commit to a single
bit $b$, we could ask the sender to first commit to two random bits $r_1$ and $r_2$, and then send
$b \oplus r_1 \oplus r_2$ to the receiver. The hope is that the receiver has to guess both $r_1$ and $r_2$ correctly
in order to find $b$, and so the protocol should be more secure.

In the case where the protocol has some defect that sometimes allows a sender to cheat, we
might also want to consider the protocol where the sender commits twice to $b$, or, alternatively,
that he commits to $r_1$, then to $r_2$, and sends both $b \oplus r_1$ and $b \oplus r_2$ to the receiver. In this
case, one can hope that a cheating receiver still needs to break the protocol at least once,
and that the security should not degrade too much.

Just how will the security change? We want to consider a scenario in which the security
is information theoretic. We can do this by assuming that instead of the weak protocol, a
trusted party distributes a bit $X$ to the sender and some side information $Z$ to the receiver.
The guarantee is that for any $f$, $\Pr[f(Z) = X] \le 1 - \frac{\delta}{2}$. In such a case, one can easily obtain
bounds on the security of the above protocols, and the hope is that the same bounds hold
in the computational case. The theorem below states that this is indeed true (for protocols
where the security consists of hiding single bits).
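To make the information theoretic bounds concrete, the following sketch evaluates the two example compositions exactly. It assumes one particular trusted-party distribution, chosen only for illustration: the side information reveals the committed bit with probability $1 - \delta$ and is an erasure otherwise, so a single bit is guessed with probability exactly $1 - \frac{\delta}{2}$. The function names are ours.

```python
from fractions import Fraction

def guess_single(delta):
    """One weak commitment: the bit is revealed with probability 1 - delta."""
    return (1 - delta) + delta * Fraction(1, 2)

def guess_xor_protocol(delta):
    """Sender sends b XOR r1 XOR r2: the receiver needs both r1 and r2."""
    knows_both = (1 - delta) ** 2
    return knows_both + (1 - knows_both) * Fraction(1, 2)

def guess_repeat_protocol(delta):
    """Sender sends b XOR r1 and b XOR r2: knowing either r1 or r2 suffices."""
    knows_one = 1 - delta ** 2
    return knows_one + (1 - knows_one) * Fraction(1, 2)

delta = Fraction(1, 2)
single = guess_single(delta)              # 3/4
xored = guess_xor_protocol(delta)         # 5/8
repeated = guess_repeat_protocol(delta)   # 7/8
```

Under this assumed distribution the XOR variant improves hiding, while the repetition variant trades some hiding for robustness against a cheating sender.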

We remark that while the two aforementioned examples of protocol composition are al- 
ready handled in [HR08] (their result applies to any direct product and any XOR as above), 
Theorem [8] handles any information theoretic amplification protocol as long as it can be 
implemented efficiently. 

Definition 7. A pair $(X, Z)$ of random variables over $\{0,1\} \times \mathcal{Z}$, where $\mathcal{Z}$ is any finite set,
is $\delta$-hiding if

$$\max_{f : \mathcal{Z} \to \{0,1\}} \Pr[f(Z) = X] \le 1 - \frac{\delta}{2}. \tag{17}$$
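For a finite joint distribution, the hiding parameter of Definition 7 can be computed directly: the best guessing function picks the more likely value of $X$ for each $z$. The sketch below is a minimal illustration; the dictionary encoding of the distribution and the erasure example are our own choices.

```python
from fractions import Fraction

def hiding_parameter(joint):
    """Return delta with max_f Pr[f(Z) = X] = 1 - delta/2.

    joint maps pairs (x, z) to probabilities that sum to 1.
    """
    best_per_z = {}
    for (x, z), prob in joint.items():
        best_per_z[z] = max(best_per_z.get(z, 0), prob)
    best_guess = sum(best_per_z.values())   # success of the optimal f
    return 2 * (1 - best_guess)

# Example: the bit is revealed with probability 2/3 and erased ("?") otherwise.
erasure = {
    (0, "0"): Fraction(1, 3), (1, "1"): Fraction(1, 3),
    (0, "?"): Fraction(1, 6), (1, "?"): Fraction(1, 6),
}
delta = hiding_parameter(erasure)
```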

Theorem 8. Let a cryptographic protocol (which we think of as "weak") $W = (A_W, B_W)$ be
given in which $A_W$ has as input a single bit $c$. Assume that there is a function $\delta$ such that
for any polynomial time adversary $B_W^*$ there is a negligible function $\nu$ such that

$$\Pr_{x \leftarrow \{0,1\}}[\langle A_W(x), B_W^* \rangle_B = x] \le 1 - \frac{\delta}{2} + \nu(n), \tag{18}$$

where the probability is also over the coins of $A_W$ and $B_W^*$ (if any).

Let further an information theoretic protocol $I = (A_I, B_I)$ be given. In $I$, $A_I$ takes $k$ input
bits $(X_1, \ldots, X_k)$ and has a single output bit. Furthermore, assume that $I$ is hiding in the
sense that for $k$ independent $\delta$-hiding random variables $(X_i, Z_i)$, any (information theoretic)
adversary $B_I^*$, and for some function $\eta(k)$:

$$\Pr[\langle A_I(X_1, \ldots, X_k), B_I^*(Z_1, \ldots, Z_k) \rangle_A = \langle A_I(X_1, \ldots, X_k), B_I^*(Z_1, \ldots, Z_k) \rangle_B] \le \frac{1}{2} + \eta(k). \tag{19}$$

Let $S = (A_S, B_S)$ be the protocol where $A$ and $B$ first execute $k(n)$ copies of $W$ sequentially,
where $A$ uses uniform random bits as input. Then, they run a single execution of
protocol $I$. In the execution of $I$, $A$ uses his $k$ input bits to the weak protocols as input. The
output of $A$ in $S$ is the output of $A$ in the execution of $I$. We also need that $(A_I, B_I)$ and
$k(n)$ are such that $I$ can be run in time $\operatorname{poly}(n)$ for $k = k(n)$.

Then, for any polynomial time $B_S^*$ there is a negligible function $\nu'$ such that

$$\Pr[\langle A_S, B_S^* \rangle_A = \langle A_S, B_S^* \rangle_B] \le \frac{1}{2} + \eta(k) + \nu'(n). \tag{20}$$

Proof. Let $x \in \{0,1\}^n$ be the concatenation of the randomness which $A$ uses in an execution
of the protocol $W$ and his input bit $c$. We let $P : \{0,1\}^n \to \{0,1\}$ be the predicate which
outputs $c = P(x)$.

In order to obtain a contradiction, we fix an adversary $B_S^*$ for the protocol $S$ which
violates (20). We would like to apply Theorem 4. For this, we define $C^{(k)}(x_1, b_1, \ldots, x_k, b_k)$
as follows: $C^{(k)}$ first simulates an interaction of $B_S^*$ with $A_S$, where $A_S$ uses randomness $x_i$
in the $i$th invocation of the weak protocol $W$. After this, $B_S^*$ is in some state in which it
expects an invocation of the information theoretic protocol. $C^{(k)}$ simulates this information
theoretic protocol, but it runs $A_I$ with inputs $b_1, \ldots, b_k$ instead of the actual inputs to the
weak protocols. In the end, $B_S^*$ produces a guess for the output bit of $A_S$, and $C^{(k)}$ outputs
1 if this guess equals the output of $A_I(b_1, \ldots, b_k)$ in the simulation.

In Experiment 1 of Theorem 4, $b_i = P(x_i)$ is used, and so $C^{(k)}$ exactly simulates an
execution of the protocol $S$. Since we assume that $B_S^*$ contradicts (20), we see that the
probability that $C^{(k)}$ outputs 1 in Experiment 1 is, for infinitely many $n$ and some constant $c$,
at least $\frac{1}{2} + \eta(k) + n^{-c}$.

We now apply Theorem 4 on the circuit $C^{(k)}$ with parameter $n^{-c}/3$. This yields a parameter
$\delta_{\mathrm{Thm4}}$ (the subscript indicates that it is from Theorem 4). We claim that

$$\delta_{\mathrm{Thm4}} \le \delta \text{ almost surely.} \tag{21}$$

To see this, we assume otherwise and obtain a contradiction. In Experiment 2, let $T_i$ be
the communication produced by the weak protocol $W$ in round $i$. Assuming all sets $S_i^*$ in
the execution are of size at least $\delta$ (this happens with probability at least $1 - n^{-c}/3$), the
tuples $(b_i, T_i)$ are $\delta$-hiding random variables. Consequently, when the circuit $C^{(k)}$ simulates
the information theoretic protocol $I$ using bits $b_i$, it actually simulates it in an instance in
which it was designed to be used. Since (19) holds for an arbitrary adversary in this case we
get that

$$\Pr[C^{(k)} \text{ outputs 1 in Experiment 2} \mid \text{no set } S_i^* \text{ was of measure less than } \delta] \le \frac{1}{2} + \eta(k). \tag{22}$$

Therefore, the probability that $C^{(k)}$ outputs 1 in Experiment 2 is at most $\frac{1}{2} + \eta(k) + \frac{n^{-c}}{3}$, and
using "indistinguishability" the probability that $C^{(k)}$ outputs 1 in Experiment 1 is at most
$\frac{1}{2} + \eta(k) + \frac{2n^{-c}}{3}$. However, our assumption was that the probability that $C^{(k)}$ outputs 1 is at
least $\frac{1}{2} + \eta(k) + n^{-c}$, and so almost surely Gen does not output such a big $\delta_{\mathrm{Thm4}}$, establishing
(21).

Theorem 4 also provides us with a non-rewinding circuit $Q$ which treats $x$ obliviously and
which satisfies "predictability". We explain how to use $Q$ to break (18), the security property
of the weak protocol $W$.

Since $Q(x)$ is non-rewinding, it uses the input $x$ exclusively in a fixed position $i$, together
with a fixed prefix $(x_1, \ldots, x_{i-1})$, in all calls to $C^{(k)}$. We first extract $i$ and the prefix.

We now explain a crucial point: how to interact with $A_W$ in order to cheat. We simulate
the $i - 1$ interactions of $A_W$ with $B_S^*$ up to and including round $i - 1$ using $(x_1, \ldots, x_{i-1})$ as
the input bit and randomness of $A$. In round $i$, we continue with the actual interaction with
$A_W$. Here, $A_W$ uses randomness $x$ (to which we, however, do not have access).

After this interaction, we need to be able to extract the bit $c$ of $A_W$. For this, we evaluate
$Q(x)$, which we claim is possible. Since $Q$ is oblivious and deterministic, the only difficulty is
in evaluating the calls to $C^{(k)}(x_1, b_1, \ldots, x_k, b_k, r)$. All calls use the same values for $x_1, \ldots, x_i$.
Recalling how $C^{(k)}$ is defined, we see that we can continue from the state we had after the
interaction with $A_W$ in order to evaluate $C^{(k)}$ completely (note that all the $b_i$ are given, so
we can also evaluate the information theoretic protocol $I$).

We get from Theorem 4 that $Q$ satisfies, almost surely, infinitely often, using (21),

$$\Pr_{x \leftarrow \{0,1\}^n}[Q(x) = P(x)] \ge 1 - \frac{\delta}{2} + \frac{1}{48 k n^c}. \tag{23}$$

This therefore gives a contradiction to (18): in order to get rid of the "almost surely", we
just consider the algorithm which first runs Gen and then applies the above protocol; this
only loses a negligible additive term in the probability. □

4 Weakly Verifiable Puzzles 

4.1 Interactive Weakly Verifiable Puzzles 

Consider a bit commitment protocol, in which a sender commits to a single bit $b$. In a first
phase the sender and the receiver engage in an interactive protocol, after which the sender
holds some opening information $y$, and the receiver has some way of checking whether $(y, b)$
is a valid decommitment. If the protocol is secure, then it is a computationally hard problem
for the sender to come up with two strings $y_0$ and $y_1$ such that both $(y_0, 0)$ and $(y_1, 1)$ are
valid decommitments. In addition, he may not even know the function the receiver will use to
validate a decommitment pair,⁵ and thus in general there is no way for the sender to recognize
a valid pair $(y_0, y_1)$. We abstract this situation in the following definition; in it we can say
that the solver produces no output because in the security property all efficient algorithms
are considered anyhow.

Definition 9. An interactive weakly verifiable puzzle consists of a protocol $(P, S)$ and is given
by two interactive algorithms $P$ and $S$, in which $P$ (the problem poser) produces as output
a circuit $\Gamma$, and $S$ (the solver) produces no output.

The success probability of an interactive algorithm $S^*$ in solving a weakly verifiable puzzle
$(P, S)$ is:

$$\Pr[y = \langle P, S^* \rangle_{S^*};\ \Gamma(y) = 1]. \tag{24}$$

The puzzle is non-interactive if the protocol consists of $P$ sending a single message to $S$.

Our definition of a non-interactive weakly verifiable puzzle coincides with the usual one
[CHS05]. The security property of an interactive weakly verifiable puzzle is that for any
algorithm (or circuit) $S^*$ of a restricted class, the success probability of $S^*$ is bounded.

An important property is that $S^*$ does not get access to $\Gamma$. Besides bit commitment
above, an example of such a puzzle is a CAPTCHA. In both cases it is not obvious whether
a given solution is actually a correct solution.

4.2 Strengthening interactive weakly verifiable puzzles 

Suppose that $g$ is a monotone boolean function with $k$ bits of input, and $(P^{(1)}, S^{(1)})$ is a
puzzle. We can consider the following new puzzle $(P^{(g)}, S^{(g)})$: the sender and the receiver
sequentially create $k$ instances of $(P^{(1)}, S^{(1)})$, which yields circuits $\Gamma^{(1)}, \ldots, \Gamma^{(k)}$ for $P$. Then
$P^{(g)}$ outputs the circuit $\Gamma^{(g)}$ which computes $\Gamma^{(g)}(y_1, \ldots, y_k) = g(\Gamma^{(1)}(y_1), \ldots, \Gamma^{(k)}(y_k))$.
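The composed check circuit is simply $g$ applied to the outputs of the individual check circuits. A minimal sketch follows; the toy puzzle (a check circuit that compares against a hidden secret) and the helper names are hypothetical and serve only to illustrate the composition.

```python
def make_check(secret):
    """Toy check circuit: accepts exactly the hidden secret (illustrative only)."""
    return lambda y: int(y == secret)

def compose(g, checks):
    """Return the check circuit that applies g to the k individual verdicts."""
    def composed_check(answers):
        return g([check(y) for check, y in zip(checks, answers)])
    return composed_check

def threshold(t):
    """Monotone function: 1 if at least t of the input bits are 1."""
    return lambda bits: int(sum(bits) >= t)

checks = [make_check(secret) for secret in (5, 9, 12)]
check_two_of_three = compose(threshold(2), checks)
check_all = compose(threshold(3), checks)
```

With `threshold(k)` one recovers the direct product of puzzles, and with `threshold(1)` the solver only needs to solve a single instance.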

5 One might want to generalize this by saying that in order to open the commitment, sender and receiver 
enter yet another interactive protocol. However, our presentation is without loss of generality: the sender 
can send the randomness he used in the first protocol instead. The receiver then checks, if this randomness 
together with b indeed produces the communication in the first round, and whether in a simulation of the 
second protocol he accepts.

Intuitively, if no algorithm solves a single puzzle $(P^{(1)}, S^{(1)})$ with higher probability
than $\delta$, the probability that an algorithm solves $(P^{(g)}, S^{(g)})$ should not be more than
approximately $\Pr_{u \leftarrow \mu_\delta^k}[g(u) = 1]$. (Recall that $\mu_\delta^k$ is the distribution on $k$ bits, where each bit
is independent and 1 with probability $\delta$.) The following theorem states exactly this.

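Theorem 10. There exists an algorithm Gen$(C, g, \epsilon, \delta, n)$ which takes as input a circuit $C$, a
monotone function $g$, and parameters $\epsilon, \delta, n$, and produces a circuit $D$ such that the following
holds. If $C$ is such that

$$\Pr[\Gamma^{(g)}(\langle P^{(g)}, C \rangle_C) = 1] \ge \Pr_{u \leftarrow \mu_\delta^k}[g(u) = 1] + \epsilon, \tag{25}$$

then $D$ satisfies, almost surely,

$$\Pr[\Gamma^{(1)}(\langle P^{(1)}, D \rangle_D) = 1] \ge \delta + \frac{\epsilon}{6k}. \tag{26}$$

Additionally, Gen and $D$ only require oracle access to both $g$ and $C$, and $D$ is non-rewinding.

Furthermore, $\operatorname{size}(D) \le \operatorname{size}(C) \cdot \frac{6k}{\epsilon} \log(\frac{6k}{\epsilon})$ and Gen runs in time $\operatorname{poly}(k, \frac{1}{\epsilon}, n)$ with oracle
calls to $C$.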

The monotone restriction on $g$ in the previous theorem is necessary. For example, consider
$g(b) = 1 - b$. It is possible to satisfy $g$ with probability 1 by producing an incorrect answer,
but $\Pr_{u \leftarrow \mu_\delta}[g(u) = 1] = 1 - \delta$.
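For small $k$, the benchmark probability $\Pr_{u \leftarrow \mu_\delta^k}[g(u) = 1]$ can be evaluated exactly by enumeration. The sketch below does this with rational arithmetic; the example functions (AND, majority, and the non-monotone negation mentioned above) are our own choices.

```python
from fractions import Fraction
from itertools import product

def prob_g_accepts(g, k, delta):
    """Exact Pr[g(u) = 1] when each of the k bits of u is 1 with probability delta."""
    total = Fraction(0)
    for u in product((0, 1), repeat=k):
        weight = Fraction(1)
        for bit in u:
            weight *= delta if bit else 1 - delta
        if g(u):
            total += weight
    return total

all_of = lambda bits: int(all(bits))
majority = lambda bits: int(2 * sum(bits) > len(bits))
negation = lambda bits: 1 - bits[0]          # not monotone

p_and = prob_g_accepts(all_of, 3, Fraction(1, 2))
p_maj = prob_g_accepts(majority, 3, Fraction(1, 2))
p_not = prob_g_accepts(negation, 1, Fraction(1, 4))
```

For the negation the benchmark is $1 - \delta$, although a solver can satisfy it with probability 1 by answering incorrectly, which is why the theorem requires monotonicity.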

4.3 Proof of Theorem 10

Algorithm Description If k = 1, Gen creates the circuit D which runs C and outputs its 
answer. Then either g is the identity or a constant function. If g is the identity, the statement 
is trivial. If g is a constant function, the statement is vacuously true. D is non-rewinding. 

In the general case, we need some notation. For $b \in \{0,1\}$, let $\mathcal{G}_b$ denote the set of inputs
$\mathcal{G}_b := \{b_1, \ldots, b_k \mid g(b, b_2, \ldots, b_k) = 1\}$ (i.e., the first input bit is disregarded and replaced
by $b$). We remark that $\mathcal{G}_0 \subseteq \mathcal{G}_1$ due to monotonicity of $g$. We will commonly denote by
$u = u_1 u_2 \cdots u_k \in \{0,1\}^k$ an element drawn from $\mu_\delta^k$. After a given interaction of $C$ with $P^{(g)}$,
let $c = c_1 c_2 \cdots c_k \in \{0,1\}^k$ denote the string where $c_i$ is the output of $\Gamma^{(i)}$ on input $y_i$, which
is the $i$th output of $C$. We denote the randomness used by $P^{(g)}$ in execution $i$ by $\pi_i$.
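The two sets can be enumerated for small $k$, which also lets one check the identity $\Pr[g(u) = 1] = \Pr[u \in \mathcal{G}_0] + \delta \Pr[u \in \mathcal{G}_1 \setminus \mathcal{G}_0]$ that the analysis uses later. The sketch below is illustrative; the helper names and the choice of majority on three bits are ours.

```python
from fractions import Fraction
from itertools import product

def slice_set(g, k, b):
    """All u in {0,1}^k with g(b, u_2, ..., u_k) = 1; the first bit of u is ignored."""
    return {u for u in product((0, 1), repeat=k) if g((b,) + u[1:])}

def measure(event, delta):
    """Probability of a set of k-bit strings when each bit is 1 with probability delta."""
    total = Fraction(0)
    for u in event:
        weight = Fraction(1)
        for bit in u:
            weight *= delta if bit else 1 - delta
        total += weight
    return total

majority = lambda bits: int(sum(bits) >= 2)
k, delta = 3, Fraction(1, 3)
g0, g1 = slice_set(majority, k, 0), slice_set(majority, k, 1)
accepted = {u for u in product((0, 1), repeat=k) if majority(u)}
lhs = measure(accepted, delta)
rhs = measure(g0, delta) + delta * measure(g1 - g0, delta)
```

The identity holds because membership in either set does not depend on the first bit, and the first bit only matters for strings in the difference of the two sets.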

For $(\pi^*, b) \in \{0,1\}^n \times \{0,1\}$ we now define the surplus $S_{\pi^*,b}$. It denotes how much better
$C$ performs than "it should", in the case where the randomness of $P^{(g)}$ in the first instance
is fixed to $\pi^*$, and the output of $\Gamma^{(1)}(y_1)$ is ignored (i.e., we don't care whether $C$ solves the
first puzzle right), and $b$ is used instead:

$$S_{\pi^*,b} := \Pr_{\pi^{(k)}}[c \in \mathcal{G}_b \mid \pi_1 = \pi^*] - \Pr_{u \leftarrow \mu_\delta^k}[u \in \mathcal{G}_b], \tag{27}$$

where the first probability is also over the interaction between $P^{(g)}$ and $C$ as well as
randomness $C$ uses (if any).

The algorithm then works as follows: first pick $\frac{6k}{\epsilon} \log(n)$ candidates $\pi^*$ for the randomness
of $P^{(g)}$ in the first position. For each of those, simulate the interaction $(P^{(g)}, C)$ and then
get estimates $\tilde{S}_{\pi^*,0}$ and $\tilde{S}_{\pi^*,1}$ of $S_{\pi^*,0}$ and $S_{\pi^*,1}$ such that $|\tilde{S}_{\pi^*,b} - S_{\pi^*,b}| \le \frac{\epsilon}{4k}$ almost surely.

We consider two cases:

• One of the estimates satisfies $\tilde{S}_{\pi^*,b} \ge (1 - \frac{3}{4k})\epsilon$.

In this case, we fix $\pi_1 := \pi^*$ and $c_1 := b$, and invoke Gen$(C', g', (1 - \frac{1}{k})\epsilon, \delta, n)$, using
the function $g'(b_2, \ldots, b_k) = g(c_1, b_2, \ldots, b_k)$ and circuit $C'$ which is defined as follows:
$C'$ first (internally) simulates an interaction of $P^{(1)}$ with $C$, then follows up with an
interaction with $P^{(g')}$.

• For all estimates, $\tilde{S}_{\pi^*,b} < (1 - \frac{3}{4k})\epsilon$.

In this case, we output the following circuit $D^C$: in a first phase, use $C$ to interact
with $P^{(1)}$. In the second phase, simulate $k - 1$ interactions with $P^{(1)}$ and obtain
$(y_1, \ldots, y_k) = C(x, x_2, \ldots, x_k)$. For $i = 2, \ldots, k$ set $c_i = \Gamma_i(y_i)$. If $c = (0, c_2, \ldots, c_k) \in
\mathcal{G}_1 \setminus \mathcal{G}_0$, return $y_1$, otherwise repeat the second phase $\frac{6k}{\epsilon} \log(\frac{6k}{\epsilon})$ times. If all attempts
fail, return the special value $\bot$ (or an arbitrary answer).
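The non-recursive case is essentially a rejection-sampling loop, sketched below. The two callbacks are hypothetical interfaces that we introduce for illustration: `run_first_phase` stands for the interaction of $C$ with the real poser, and `run_second_phase` stands for continuing $C$ against $k - 1$ simulated posers and evaluating their check circuits. The retry count follows the text; reading the logarithm as a natural logarithm is an assumption.

```python
import math
import random

def make_solver(run_first_phase, run_second_phase, g, k, eps):
    """Sketch of the solver built in the non-recursive case.

    run_first_phase(rng) -> state                   (hypothetical callback)
    run_second_phase(state, rng) -> (y1, c_rest)    (hypothetical callback),
        where c_rest = (c_2, ..., c_k) are verdicts of the simulated puzzles.
    """
    # (6k/eps) * log(6k/eps) retries; the natural logarithm is an assumption.
    attempts = math.ceil(6 * k / eps * math.log(6 * k / eps))

    def solver(rng):
        state = run_first_phase(rng)          # happens once, never rewound
        for _ in range(attempts):
            y1, c_rest = run_second_phase(state, rng)
            in_g1 = g((1,) + tuple(c_rest))   # would g accept with first bit 1?
            in_g0 = g((0,) + tuple(c_rest))   # would g accept with first bit 0?
            if in_g1 and not in_g0:           # the first puzzle is decisive
                return y1
        return None                           # plays the role of the failure symbol

    return solver, attempts

# Toy usage with k = 2 and g = AND (all names below are illustrative).
g_and = lambda bits: int(all(bits))
first_phase = lambda rng: None
always_decisive = lambda state, rng: ("answer", (1,))
never_decisive = lambda state, rng: ("answer", (0,))
solver_hit, attempts = make_solver(first_phase, always_decisive, g_and, k=2, eps=0.5)
solver_miss, _ = make_solver(first_phase, never_decisive, g_and, k=2, eps=0.5)
```

With this retry count, a run in which the decisive event has probability at least $\frac{\epsilon}{6k}$ per attempt fails with probability at most $\frac{\epsilon}{6k}$, which is the fact used in the analysis.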



Overview of Correctness The interesting case is when Gen does not recurse. In this case
we know that $C$ has higher success probability than $\Pr_{u \leftarrow \mu_\delta^k}[g(u) = 1]$, but for most $\pi^*$, the
surpluses $S_{\pi^*,0}$ and $S_{\pi^*,1}$ are less than $(1 - \frac{1}{k})\epsilon$. Intuitively, then $C$ is correct on the first
coordinate unusually often when $c \in \mathcal{G}_1 \setminus \mathcal{G}_0$ (as this is the only time that being correct on
the first coordinate helps). If we could assume that 1) the algorithm always outputs an
answer, and 2) for every $\pi^*$, the surpluses $S_{\pi^*,0}$ and $S_{\pi^*,1}$ are less than $(1 - \frac{1}{k})\epsilon$, then the
theorem would follow by straightforward manipulations of probability.

Unfortunately these assumptions are not true, but the proof below shows that because
these assumptions only fail slightly, not much is lost. Informally, Equations (30)-(35) show that
if the algorithm fails to output an answer it is either because $\Pr[c \in \mathcal{G}_1 \setminus \mathcal{G}_0 \mid \pi_1 = \pi^*]$
is very small (in which case this $\pi^*$ will not contribute much anyhow), or because we are
unlucky (which happens with very small probability). Additionally, Equations (37)-(41) show
that because we did not find a $\pi^*$ with large surplus, we can assume that (unless we were
very unlucky) there are few $\pi^*$ with large surpluses, which cannot have undue influence.

Analysis of Correctness Consider first the case that we find $(\pi^*, b)$ for which $\tilde{S}_{\pi^*,b} \ge
(1 - \frac{3}{4k})\epsilon$. We can assume that $S_{\pi^*,b} \ge (1 - \frac{1}{k})\epsilon$, since the error is at most $\epsilon/(4k)$ almost
surely. Thus, we satisfy all the requirements to use Gen with $k - 1$ (using $\pi^*$ as the first
input and $g(b, \cdot)$ as the monotone function with $k - 1$ inputs), which will return a non-rewinding
circuit for which $\Pr_{(x,\Gamma),r}[\Gamma(D(x, r)) = 1] \ge \delta + (1 - \frac{1}{k})\epsilon/(6(k - 1)) = \delta + \epsilon/(6k)$. The remaining
properties are easily verified.

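The more interesting case is if Gen does not recurse. First, we get, for any puzzle $\pi^* =
(x^*, \Gamma^*)$ (simply using (27) and $\mathcal{G}_0 \subseteq \mathcal{G}_1$),

$$\Pr_{u \leftarrow \mu_\delta^k}[u \in \mathcal{G}_1 \setminus \mathcal{G}_0] = \Pr_{\pi^{(k)}}[c \in \mathcal{G}_1 \setminus \mathcal{G}_0 \mid \pi_1 = \pi^*] - (S_{\pi^*,1} - S_{\pi^*,0}) \tag{28}$$

and thus, still fixing $\pi^*$ and multiplying by $\Pr_r[\Gamma^*(D(x^*, r)) = 1] / \Pr_{u \leftarrow \mu_\delta^k}[u \in \mathcal{G}_1 \setminus \mathcal{G}_0]$:

$$\Pr_r[\Gamma^*(D(x^*, r)) = 1] = \frac{\Pr_r[\Gamma^*(D(x^*, r)) = 1] \, \Pr_{\pi^{(k)}}[c \in \mathcal{G}_1 \setminus \mathcal{G}_0 \mid \pi_1 = \pi^*]}{\Pr_{u \leftarrow \mu_\delta^k}[u \in \mathcal{G}_1 \setminus \mathcal{G}_0]}
 - \frac{\Pr_r[\Gamma^*(D(x^*, r)) = 1] \, (S_{\pi^*,1} - S_{\pi^*,0})}{\Pr_{u \leftarrow \mu_\delta^k}[u \in \mathcal{G}_1 \setminus \mathcal{G}_0]}. \tag{29}$$

We bound the first summand in (29):

$$\Pr_r[\Gamma^*(D(x^*, r)) = 1] \, \Pr_{\pi^{(k)}}[c \in \mathcal{G}_1 \setminus \mathcal{G}_0 \mid \pi_1 = \pi^*]
 = \Pr_r[\Gamma^*(D(x^*, r)) \ne \bot] \, \Pr_{\pi^{(k)}}[c_1 = 1 \mid c \in \mathcal{G}_1 \setminus \mathcal{G}_0, \pi_1 = \pi^*] \, \Pr_{\pi^{(k)}}[c \in \mathcal{G}_1 \setminus \mathcal{G}_0 \mid \pi_1 = \pi^*]. \tag{30}$$

If $\Pr_{\pi^{(k)}}[c \in \mathcal{G}_1 \setminus \mathcal{G}_0 \mid \pi_1 = \pi^*] \le \frac{\epsilon}{6k}$, then
$0 \ge \Pr_{\pi^{(k)}}[c_1 = 1 \mid c \in \mathcal{G}_1 \setminus \mathcal{G}_0, \pi_1 = \pi^*] \, \Pr_{\pi^{(k)}}[c \in \mathcal{G}_1 \setminus \mathcal{G}_0 \mid \pi_1 = \pi^*] - \frac{\epsilon}{6k}$.
If $\Pr_{\pi^{(k)}}[c \in \mathcal{G}_1 \setminus \mathcal{G}_0 \mid \pi_1 = \pi^*] > \frac{\epsilon}{6k}$, then $\Pr_r[\Gamma^*(D(x^*, r)) \ne \bot] \ge 1 - \frac{\epsilon}{6k}$, since $D$ only
outputs $\bot$ if after $\frac{6k}{\epsilon} \log(6k/\epsilon)$ attempts none of the elements $c$ was in $\mathcal{G}_1 \setminus \mathcal{G}_0$. In both cases:

$$\Pr_r[\Gamma^*(D(x^*, r)) \ne \bot] \, \Pr_{\pi^{(k)}}[c_1 = 1 \mid c \in \mathcal{G}_1 \setminus \mathcal{G}_0, \pi_1 = \pi^*] \, \Pr_{\pi^{(k)}}[c \in \mathcal{G}_1 \setminus \mathcal{G}_0 \mid \pi_1 = \pi^*] \tag{31}$$

$$\ge \Pr_{\pi^{(k)}}[c_1 = 1 \mid c \in \mathcal{G}_1 \setminus \mathcal{G}_0, \pi_1 = \pi^*] \, \Pr_{\pi^{(k)}}[c \in \mathcal{G}_1 \setminus \mathcal{G}_0 \mid \pi_1 = \pi^*] - \frac{\epsilon}{6k} \tag{32}$$

$$= \Pr_{\pi^{(k)}}[c_1 = 1 \wedge c \in \mathcal{G}_1 \setminus \mathcal{G}_0 \mid \pi_1 = \pi^*] - \frac{\epsilon}{6k} \tag{33}$$

$$= \Pr_{\pi^{(k)}}[g(c) = 1 \mid \pi_1 = \pi^*] - \Pr_{\pi^{(k)}}[c \in \mathcal{G}_0 \mid \pi_1 = \pi^*] - \frac{\epsilon}{6k} \tag{34}$$

$$= \Pr_{\pi^{(k)}}[g(c) = 1 \mid \pi_1 = \pi^*] - \Pr_{u \leftarrow \mu_\delta^k}[u \in \mathcal{G}_0] - S_{\pi^*,0} - \frac{\epsilon}{6k}. \tag{35}$$

Inserting into (29) gives

$$\mathrm{E}_{\pi^*}[\Pr_r[\Gamma^*(D(x^*, r)) = 1]] \ge \mathrm{E}_{\pi^*}\left[ \frac{\Pr_{\pi^{(k)}}[g(c) = 1 \mid \pi_1 = \pi^*] - \Pr_{u \leftarrow \mu_\delta^k}[u \in \mathcal{G}_0] - \frac{\epsilon}{6k}}{\Pr_{u \leftarrow \mu_\delta^k}[u \in \mathcal{G}_1 \setminus \mathcal{G}_0]}
 - \frac{S_{\pi^*,0} + \Pr_r[\Gamma^*(D(x^*, r)) = 1] \, (S_{\pi^*,1} - S_{\pi^*,0})}{\Pr_{u \leftarrow \mu_\delta^k}[u \in \mathcal{G}_1 \setminus \mathcal{G}_0]} \right]. \tag{36}$$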



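We bound the second summand of (36). Consider the set $W$ of puzzles for which both $S_{\pi^*,1}$
and $S_{\pi^*,0}$ are not very large. Formally:

$$W := \left\{ \pi \;\middle|\; \left(S_{\pi,0} \le (1 - \tfrac{1}{2k})\epsilon\right) \wedge \left(S_{\pi,1} \le (1 - \tfrac{1}{2k})\epsilon\right) \right\}. \tag{37}$$

Almost surely, $\mu(W) \ge 1 - \frac{\epsilon}{6k}$: otherwise Gen would accept one of the sampled puzzles almost
surely and recurse. Thus, we get

$$\mathrm{E}_{\pi^*}[S_{\pi^*,0} + \Pr[\Gamma^*(D(x^*, r)) = 1] \, (S_{\pi^*,1} - S_{\pi^*,0})] \tag{38}$$

$$\le \frac{\epsilon}{6k} + \mathrm{E}_{\pi^* \leftarrow W}[S_{\pi^*,0} + \Pr[\Gamma^*(D(x^*, r)) = 1] \, (S_{\pi^*,1} - S_{\pi^*,0})] \tag{39}$$

$$\le \frac{\epsilon}{6k} + \mathrm{E}_{\pi^* \leftarrow W}[S_{\pi^*,0} + \Pr[\Gamma^*(D(x^*, r)) = 1] \, ((1 - \tfrac{1}{2k})\epsilon - S_{\pi^*,0})] \tag{40}$$

$$\le \frac{\epsilon}{6k} + \mathrm{E}_{\pi^* \leftarrow W}[S_{\pi^*,0} + ((1 - \tfrac{1}{2k})\epsilon - S_{\pi^*,0})] = \frac{\epsilon}{6k} + (1 - \tfrac{1}{2k})\epsilon. \tag{41}$$

We insert into (36) (and then use $\Pr[g(u) = 1] = \Pr[u \in \mathcal{G}_0] + \delta \Pr[u \in \mathcal{G}_1 \setminus \mathcal{G}_0]$) to get

$$\Pr_{\pi,r}[\Gamma(D(x, r)) = 1] \tag{42}$$

$$\ge \mathrm{E}_{\pi^*}\left[ \frac{\Pr_{\pi^{(k)}}[g(c) = 1 \mid \pi_1 = \pi^*] - \Pr_{u \leftarrow \mu_\delta^k}[u \in \mathcal{G}_0] - \frac{\epsilon}{3k} - (1 - \frac{1}{2k})\epsilon}{\Pr_{u \leftarrow \mu_\delta^k}[u \in \mathcal{G}_1 \setminus \mathcal{G}_0]} \right] \tag{43}$$

$$\ge \frac{\Pr_{u \leftarrow \mu_\delta^k}[g(u) = 1] + \epsilon - \Pr_{u \leftarrow \mu_\delta^k}[u \in \mathcal{G}_0] - (1 - \frac{1}{6k})\epsilon}{\Pr_{u \leftarrow \mu_\delta^k}[u \in \mathcal{G}_1 \setminus \mathcal{G}_0]} \tag{44}$$

$$= \frac{\delta \Pr_{u \leftarrow \mu_\delta^k}[u \in \mathcal{G}_1 \setminus \mathcal{G}_0] + \frac{\epsilon}{6k}}{\Pr_{u \leftarrow \mu_\delta^k}[u \in \mathcal{G}_1 \setminus \mathcal{G}_0]} \ge \delta + \frac{\epsilon}{6k}. \tag{45}$$

This concludes the proof of Theorem 10. □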



5 Example: Bit Commitment 

Theorems 8 and 10 can be used to show how to strengthen bit commitment protocols. We
explain this as an example here. Assume we are given a weak bit commitment protocol, where a cheating
receiver can guess a bit after the commitment phase with probability $1 - \frac{\beta}{2}$, and a cheating
sender can change the bit he committed to with probability $\alpha$. We show that such a protocol
can be strengthened if $\alpha < \beta - 1/\operatorname{poly}(n)$.

We should point out that a different way to prove a similar theorem exists: one can
first show that such a weak bit-commitment protocol implies one-way functions (using the
techniques of [IL89]). The long sequence of works [HILL99, Nao91, Rom90, NOV06, HR07]
implies that one-way functions are sufficient to build bit commitment protocols (the first two
papers will yield statistically binding protocols, the last three statistically hiding protocols).
However, this will be less efficient and also seems less natural than the method we use here.

In the following, we first define weak bit commitment protocols. We then recall a theorem
by Valiant [Val84], and then show how to use it to strengthen bit commitment.



5.1 Weak Bit Commitment Protocols 

We formalize a "weak" bit commitment protocol between a sender and a receiver by
considering algorithms $S(b, r_S)$ and $R(r_R)$, where $b$ is the bit which the sender commits to,
and $r_S$ and $r_R$ are the randomness of the sender and receiver respectively. We denote by
$T(S(b, r_S) \leftrightarrow R(r_R))$ the communication which one obtains by running $S(b, r_S)$ interacting
with $R(r_R)$. Also, $\langle S(b, r_S) \leftrightarrow R(r_R) \rangle_S$ denotes the output which $S$ produces in such
an interaction, which for an honest $S$ will be used later to verify the commitment. Let
$\langle S(b, r_S) \leftrightarrow R(r_R) \rangle_R$ denote the output receiver $R$ produces which can be thought of as a
guess of $b$.

Definition 11. An $\alpha$-binding $\beta$-hiding bit commitment protocol consists of two randomized
interactive TMs $S(b, r_S)$ and $R(r_R)$, as well as a check-algorithm $R_c$, with the following
properties.

Correctness The protocol works if both parties are honest. More concretely, for $\gamma =
T(S(b, r_S) \leftrightarrow R(r_R))$ and $\tau = \langle S(b, r_S) \leftrightarrow R(r_R) \rangle_S$ we have that $R_c(b, \gamma, \tau) = 1$
with probability $1 - \operatorname{negl}(n)$.

Binding A malicious sender cannot open the commitment in two ways: For any randomized
polynomial time machine $S^*(r_S)$, setting $\gamma := T(S^*(r_S) \leftrightarrow R(r_R))$, the probability that
$S^*$ outputs $\tau_0$ and $\tau_1$ such that $R_c(0, \gamma, \tau_0) = 1$ and $R_c(1, \gamma, \tau_1) = 1$ is at most $\alpha$.

Hiding For any randomized polynomial time machine $R^*$, $\Pr[\langle S(b, r_S) \leftrightarrow R^*(r_R) \rangle_R = b] \le
1 - \frac{\beta}{2}$ if $b$ is chosen uniformly at random.

If a protocol is $1/p(n)$-binding and $(1 - 1/p(n))$-hiding for all polynomials $p(\cdot)$ and all but finitely
many $n$ we say that it is a strong bit commitment protocol.

We point out that our notation is chosen such that for a strong bit commitment scheme,
$\alpha \to 0$ and $\beta \to 1$. Given an $\alpha$-binding $\beta$-hiding bit commitment protocol, we would like
to use it to get a strong bit commitment protocol. By a simulation technique [DKS99] this
is impossible if $\alpha \ge \beta$ (there is a simple protocol which achieves this bound for semi-honest
parties without any assumption: with probability $1 - \alpha$ the sender sends his output bit
to the receiver, and otherwise neither party sends anything). Our results will show that
if $\alpha < \beta - 1/\operatorname{poly}(n)$ then such a strengthening exists. Previously, such a result was only
known for $\alpha < \beta - 1/\operatorname{polylog}(n)$ [HR08] (if one is restricted to reductions in which the parties
can only use the given protocol interactively, and not to build a one-way function).

5.2 Monotone Threshold Functions 

Given a weak protocol (S, R), we will transform it as follows: the parties will execute (S, R) 
sequentially k times, where the sender uses random bits as input. Then, they will apply 
an "extraction protocol", which is made with the following two properties in mind: a party 
who knows at least a $1 - \alpha$ fraction of the committed bits will know the output bit almost
surely; a party who has no information about a $1 - \beta$ fraction of the input bits will have no
information about the output bit almost surely. It turns out that such an extraction process 
can be modeled as a monotone boolean circuit, where every wire is used in at most one gate 
(i.e., read-once formulas). 

To get such a circuit, we use the following lemma. It can be obtained by the techniques
of Valiant [Val84]. Also, it appears in a more disguised form as Lemma 7 in [DKS99] (where
it is used for the same task we use it here, but not stated in this language).

Lemma 12 ([Val84, DKS99]). Let $\alpha, \beta$ with $\alpha < \beta - 1/\operatorname{poly}(n)$ be efficiently computable.
There exists a $k \in \operatorname{poly}(n)$ and an efficiently computable monotone circuit $g(m_1, \ldots, m_k)$
where every wire is used in at most one gate and such that

$$\Pr[g(\mu_\beta^k) = 1] > 1 - 2^{-n} \tag{46}$$

and

$$\Pr[g(\mu_\alpha^k) = 1] < 2^{-n}. \tag{47}$$
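The flavor of such read-once amplification can be seen in a small numerical sketch. It is not the construction behind the lemma: it only iterates one fixed layer (an OR of two ANDs on fresh, independent inputs), which pushes input probabilities below its fixed point $(\sqrt{5} - 1)/2 \approx 0.618$ toward 0 and those above it toward 1. Handling arbitrary $\alpha < \beta$ requires a more careful choice of gates, which is not attempted here.

```python
def or_of_ands(p):
    """Probability that (a AND b) OR (c AND d) is 1 for independent inputs of bias p."""
    q = p * p                    # an AND gate on two fresh inputs
    return 1 - (1 - q) ** 2      # an OR gate on two fresh AND gates

def amplify(p, rounds):
    """Output bias of a read-once formula built from `rounds` such layers."""
    for _ in range(rounds):
        p = or_of_ands(p)
    return p

low = amplify(0.5, 8)    # starts below the fixed point, driven toward 0
high = amplify(0.7, 8)   # starts above the fixed point, driven toward 1
```

Each layer multiplies the number of input wires by four, so eight layers correspond to a read-once formula on $4^8$ inputs.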

5.3 Strengthening Bit Commitment 

We come to our result of this section. 






Theorem 13. Let $(S, R)$ be an $\alpha$-binding and $\beta$-hiding bit commitment protocol for polynomial
time computable functions $\alpha$ and $\beta$ with $\alpha < \beta - 1/\operatorname{poly}(n)$. Then, there is an oblivious
black-box construction of a bit commitment scheme $(S_0, R_0)$.

Proof. Let $g$ be as guaranteed by Lemma 12 for these parameters $\alpha$, $\beta$, and $k$ the input
length of $g$. The players run $k$ instances of $(S, R)$ sequentially, where the sender commits
to a uniform random bit $c_i$ in instance $i$. We associate each $c_i$ to one of the input wires.
The sender then runs the following "extraction protocol", in which he uses additional
variables $c_{k+1}, \ldots, c_{2k-1}$. We associate those with the other wires in $g$. The sender then
traverses $g$ as if he were evaluating the circuit. When encountering a gate with input wires $i$,
$j$, and output wire $\ell$, he distinguishes two cases. If the gate is an OR gate, he sets $c_\ell = c_i \oplus c_j$.
If the gate is an AND gate, the sender sets $c_\ell$ to be a completely new random value and
sends $c_\ell \oplus c_i$ and $c_\ell \oplus c_j$ to the receiver. Once the sender "evaluated" $g$ in this way, he sends
$b \oplus c_{2k-1}$ to the receiver (where $b$ is the input to the sender, and $c_{2k-1}$ is the bit associated
with the output wire of $g$).
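The traversal can be sketched as follows. The nested-tuple encoding of the read-once formula and the function names are our own, and the receiver-side routine is only one plausible reading of the consistency check performed after the committed bits have been opened.

```python
import random

def sender_messages(formula, inputs, b, rng):
    """Messages of the extraction phase.

    formula is ("IN", i), ("OR", left, right) or ("AND", left, right);
    inputs[i] is the bit committed to in instance i (illustrative encoding).
    """
    messages = []

    def wire(node):
        if node[0] == "IN":
            return inputs[node[1]]
        left, right = wire(node[1]), wire(node[2])
        if node[0] == "OR":
            return left ^ right                 # OR gate: XOR of the input wires
        fresh = rng.getrandbits(1)              # AND gate: new random wire bit
        messages.append((fresh ^ left, fresh ^ right))
        return fresh

    messages.append(b ^ wire(formula))          # mask b with the output wire
    return messages

def receiver_decode(formula, inputs, messages):
    """Recompute the wires from opened inputs; return b, or None if inconsistent."""
    pending = iter(messages)

    def wire(node):
        if node[0] == "IN":
            return inputs[node[1]]
        left, right = wire(node[1]), wire(node[2])
        if node[0] == "OR":
            return left ^ right
        masked_left, masked_right = next(pending)
        if masked_left ^ left != masked_right ^ right:
            raise ValueError("inconsistent AND gate")
        return masked_left ^ left

    try:
        output_wire = wire(formula)
    except ValueError:
        return None
    return next(pending) ^ output_wire

formula = ("AND", ("OR", ("IN", 0), ("IN", 1)), ("IN", 2))
committed = [1, 0, 1]
decoded = [receiver_decode(formula, committed,
                           sender_messages(formula, committed, b, random.Random(b)))
           for b in (0, 1)]
```

At an AND gate both masked values must unmask to the same fresh bit, so a sender who changes one input of that gate after the fact also has to change the other.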

To open the commitment, the sender sends all the opening information for the individ- 
ual positions to the receiver. The receiver then checks if the extraction phase was done 
consistently, and accepts if all these tests succeed and the output matches. 

Hiding: We would like to use Theorem 8. For this, it only remains to argue that
the extraction protocol is information theoretically secure. For any $\beta$-hiding random
variables $(X, Z)$, we define a random variable $H$ over $\{0,1\}$ by fixing
$\Pr[H = 1 \mid X = x, Z = z] = \frac{\min(\Pr[X = 0, Z = z], \Pr[X = 1, Z = z])}{\Pr[X = x, Z = z]}$. One checks that for any function
$f : \mathcal{Z} \to \{0,1\}$ we have $\Pr[f(Z) = X \mid H = 1] = \frac{1}{2}$ and $\Pr[H = 1] \ge \beta$ (the point of $H$ is that it is 1
exactly if $Z$ gives no information about $X$, and furthermore $H$ is often 1). We get random
variables $H_1, \ldots, H_k$ in this way, and evaluate the circuit $g(H_1, \ldots, H_k)$. One sees by
induction that $Z_1, \ldots, Z_k$ together with the communication produced gives no information about
the bit corresponding to a wire iff the corresponding value when evaluating $g(H_1, \ldots, H_k)$ is
one. Since the probability that the output is 1 is at least $1 - 2^{-n}$, we get the information theoretic
security.

Binding: We can interpret the bit commitment protocol as an interactive weakly verifiable 
puzzle: in the interaction, the receiver is the person posing the puzzle, and the sender is the 
solver. In order to solve the puzzle, the sender needs to send two valid openings to the 
receiver. 

In order to break the resulting puzzle, the sender needs to solve the subpuzzles in all
positions $i$ with $a_i = 1$, for some input $(a_1, \ldots, a_k)$ for which $g(a_1, \ldots, a_k) = 1$. Using
Theorem 10 for $\delta = \alpha$ thus gives the result. □
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